Abstract. Analysis of execution traces plays a fundamental role in many program
analysis approaches, such as runtime verification, testing, monitoring, and specification
mining. Execution traces are frequently parametric, i.e., they contain events with
parameter bindings. Each parametric trace usually consists of many meaningful trace slices
merged together, each slice corresponding to one parameter binding. For example, a Java
program creating iterator objects i1 and i2 over collection object c1 may yield a trace
createIter(c1 i1) next(i1) createIter(c1 i2) updateColl(c1) next(i1) parametric in collection c and
iterator i, whose slices corresponding to instances "c, i ↦ c1, i1" and "c, i ↦ c1, i2" are
createIter(c1 i1) next(i1) updateColl(c1) next(i1) and, respectively, createIter(c1 i2) updateColl(c1).
Several approaches have been proposed to specify and dynamically analyze parametric 
properties, but they have limitations: some in the specification formalism, others in the 
type of trace they support. Not unexpectedly, the existing approaches share common no- 
tions, intuitions, and even techniques and algorithms, suggesting that a fundamental study 
and understanding of parametric trace analysis is necessary. 

This foundational paper aims at giving a semantics-based solution to parametric trace 
analysis that is unrestricted by the type of parametric property or trace that can be 
analyzed. Our approach is based on a rigorous understanding of what a parametric 
trace/property/monitor is and how it relates to its non-parametric counterpart. A
general-purpose parametric trace slicing technique is introduced, which takes each event in the
parametric trace and dispatches it to its corresponding trace slices. This parametric trace 
slicing technique can be used in combination with any conventional, non-parametric trace 
analysis technique, by applying the latter on each trace slice. As an instance, a parametric
property monitoring technique is then presented, which processes each trace slice online. 
Thanks to the generality of parametric trace slicing, the parametric property monitoring 
technique reduces to encapsulating and indexing unrestricted and well-understood non- 
parametric property monitors (e.g., finite or push-down automata). 

The presented parametric trace slicing and monitoring techniques have been imple- 
mented and extensively evaluated. Measurements of runtime overhead confirm that the 
generality of the discussed techniques does not come at a performance expense when com- 
pared with existing parametric trace monitoring systems. 
Figure 1: Typestate property describing the correct use of the next() and hasnext() methods.

1. Introduction and Motivation 

Parametric traces, i.e., traces containing events with parameter bindings, abound in pro- 
gram executions, because they naturally appear whenever abstract parameters (e.g., variable 
names) are bound to concrete data (e.g., heap objects) at runtime. In this section we first 
discuss some motivating examples and describe the problem addressed in this paper, then 
we recall related work and put the work in this paper in context, and then we explain our 
contributions and finally the structure of the paper. 

1.1. Motivating examples and highlights. We here describe three examples of parametric
properties, in increasing order of difficulty, and use them to highlight and motivate the
semantic results and the algorithms presented in the rest of the paper. 

Typestates [34] refine the notion of type by stating not only what operations are allowed
by a particular object, but also what operations are allowed in what contexts. Typestates
are particular parametric properties with only one parameter. Figure [1] shows the typestate
description for a property saying that it is invalid to call the next() method on an iterator 
object when there are no more elements in the underlying collection, i.e., when hasnext() 
returns false, or when it is unknown if there are more elements in the collection, i.e., hasnext() 
is not called. From the unknown state, it is always an error to call the next() method because 
such an operation could be unsafe. If hasnext() is called and returns true, it is safe to call 
next(), so the typestate enters the more state. If, however, the hasnext() method returns false, 
there are no more elements, and the typestate enters the none state. In the more and none 
states, calling the hasnext() method provides no new information. It is safe to call next() from 
the more state, but it becomes unknown if more elements exist, so the typestate reenters 
the initial unknown state. Finally, calling next() from the none state results in an error. For 
simplicity, we here assume that next() is the only means to modify the state of an iterator; 
concurrent modifications are discussed in other examples shortly. 
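
The typestate just described can be written down as a small transition function. The following Python sketch is only an illustration: the state names follow the text, the event names hasnexttrue and hasnextfalse stand for a hasnext() call returning true or false, and the two transitions that the description leaves open (marked in the comments) are assumptions of this sketch.

```python
# Illustrative encoding of the typestate described above (not the paper's code).
ERROR = "error"

def step(state, event):
    """Advance the typestate by one event; the error state is absorbing."""
    if state == ERROR:
        return ERROR
    if event == "hasnexttrue":
        # unknown -> more and more -> more follow the text;
        # none -> more is an assumption (the text does not cover it).
        return "more"
    if event == "hasnextfalse":
        # unknown -> none and none -> none follow the text;
        # more -> none is an assumption (the text does not cover it).
        return "none"
    if event == "next":
        # Only safe from "more"; afterwards nothing is known again.
        return "unknown" if state == "more" else ERROR
    raise ValueError(f"unexpected event: {event}")

def run(events, state="unknown"):
    """Feed a whole sequence of events to the typestate."""
    for event in events:
        state = step(state, event)
    return state

print(run(["hasnexttrue", "next", "hasnexttrue", "next"]))  # unknown
print(run(["hasnextfalse", "next"]))                        # error
```

Making the error state absorbing reflects the reading that, once an unsafe next() has happened, later events cannot repair the violation.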

It is straightforward to represent the typestate property in Figure [1], and all typestate
properties, as particular (one-parameter) parametric properties. Indeed, the behaviors
described by the typestate in Figure [1] are intended to be obeyed by all iterator object
instances; that is, we have a property parametric in the iterator. To make this more precise,



PARAMETRIC MONITORING 






let us look at the problem from the perspective of observable program execution traces. A 
trace can be regarded as a sequence of events relevant to the property of interest, in our 
case calls to next() or to hasnext(); the latter can be further split into two categories, one when 
hasnext() returns true and the other when it returns false. Since the individuality of each
iterator matters, we must regard each event as being parametric in the iterator yielding
it. Formally, traces relevant to our typestate property are formed with three parametric
events, namely next(i), hasnexttrue(i), and hasnextfalse(i). A possible trace can be hasnexttrue(i1)
hasnextfalse(i2) next(i1) next(i2) ..., which violates the typestate property for iterator instance
i2. How to obtain execution traces is not our concern here (several runtime monitoring
systems use AspectJ instrumentation). Our results in this paper are concerned with how to
specify properties over parametric traces, what is their meaning, and how to monitor them. 

Let us first briefly discuss our approach to specifying properties over parametric exe- 
cution traces, that is, parametric properties. To keep it as easy as possible for the user and 
to leverage our knowledge on specifying ordinary, non-parametric properties, we build our 
specification approach on top of any formalism for specifying non-parametric properties. 
More precisely, all one has to do is to first specify the property using any conventional for- 
malism as if there were only one possible instance of its parameters, and then use a special 
Λ quantifier to make it parametric. For our typestate example, suppose that typestate is
the finite state machine in Figure [1], modified by replacing the method calls on edges with
actual events as described above. Then the desired parametric property is Λi . typestate. The
meaning of this parametric property is that whatever was intended for its non-parametric 
counterpart, typestate, must hold for each parameter instance; that is, the error state must 
not be reached for any iterator instance, which is precisely the desired meaning of this type- 
state. Another way to specify the same property is using a regular expression matching all 
the good behaviors, each bad prefix for any instance thus signaling a violation:

Λi . (hasnexttrue(i)+ next(i) | hasnextfalse(i)*)*

Yet another way to specify its non-parametric part is with a linear temporal logic formula:

Λi . □(next(i) → ⊙ hasnexttrue(i))

The above LTL formula says "it is always (□) the case that each next(i) event is preceded
(⊙) by a hasnexttrue(i)". The LTL formula must hold for any iterator instance. In general,
if ΛX . P is a parametric property, where X is a set of one or more parameters, we may
call P its corresponding base or root or non-parametric property. In this paper we develop
a mathematical foundation for specifying such parametric properties independently of the
formalism used for specifying their non-parametric part, define their precise semantics,
provide algorithms for online monitoring of parametric properties, and finally bring empirical
evidence showing that monitoring parametric properties is in fact feasible.

Parametric properties properly generalize typestates in two different directions. First,
parametric properties allow more than one parameter, allowing us to specify not only
properties about a given object such as the typestate example above, but also properties that
capture relationships between objects. Second, they allow us to specify infinite-state root
properties using formalisms like context-free grammars (see Sections [2.4], [2.5] and [2.6]).

Let us now consider a two-parameter property. Suppose that one is interested in analyzing
collections and iterators in Java. Then execution traces of interest may contain
events createIter(c i) (iterator i is created for collection c), updateColl(c) (c is modified), and
next(i) (i is accessed using its next element method), instantiated for particular collection
and iterator instances. Most properties of parametric traces are also parametric; for our
example, a property may be "collections are not allowed to change while accessed through
iterators", which is parametric in a collection and an iterator. The parametric property
above expressed as a regular expression (here matches mean violations) can be

Λc, i . createIter(c i) next(i)* updateColl(c)+ next(i)

From here on, when we know the number and types of parameters of each event, we omit 
writing them in parametric properties, because they are redundant; for example, we write 

Λc, i . createIter next* updateColl+ next

Parametric properties, unfortunately, are very hard to formally verify and validate 
against real systems, mainly because of their dynamic nature and potentially huge or even 
unlimited number of parameter bindings. Let us extend the above example: in Java, one 
may create a collection from a map and use the collection's iterator to operate on the map's 
elements. A similar safety property is: "maps are not allowed to change while accessed 
indirectly through iterators". Its violation pattern is: 

Λm, c, i . createColl (updateMap | updateColl)* createIter next* (updateMap | updateColl)+ next

with two new parametric events createColl(m c) (collection c is created from map m) and
updateMap(m) (m is updated). All the events used in this property provide only partial
parameter bindings (createColl binds only m and c, etc.), and parameter bindings carried by
different events may be combined into larger bindings; e.g., createColl(m1 c1) can be combined
with createIter(c1 i1) into a full binding (m1 c1 i1), and also with createIter(c1 i2) into (m1 c1 i2).
It is highly challenging for a trace analysis technique to correctly and efficiently maintain,
locate and combine trace slices for different parameter bindings, especially when the trace 
is long and the number of parameter bindings is large. 
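
To make the combination of partial bindings concrete, the following Python sketch represents a binding as a dictionary from parameter names to objects. The function names compatible and combine, and the extra binding over a collection c2, are illustrative assumptions rather than notation taken from the paper.

```python
def compatible(a, b):
    """Two partial bindings are compatible if they agree on shared parameters."""
    return all(b[p] == v for p, v in a.items() if p in b)

def combine(a, b):
    """Union of two compatible bindings, or None if they conflict."""
    return {**a, **b} if compatible(a, b) else None

create_coll = {"m": "m1", "c": "c1"}     # createColl(m1 c1)
create_iter_1 = {"c": "c1", "i": "i1"}   # createIter(c1 i1)
create_iter_2 = {"c": "c1", "i": "i2"}   # createIter(c1 i2)
other_iter = {"c": "c2", "i": "i3"}      # hypothetical iterator over another collection

print(combine(create_coll, create_iter_1))  # {'m': 'm1', 'c': 'c1', 'i': 'i1'}
print(combine(create_coll, other_iter))     # None
```

Two bindings that disagree on a shared parameter, here the collection, never contribute to the same full binding, which is why the second call yields no result.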

This paper addresses the problem of parametric trace analysis from a foundational, 
semantic perspective: 

Given a parametric trace τ and a parametric property ΛX . P, what does it
mean for τ to be a good or a bad trace for ΛX . P? How can we show it?
How can we leverage, to the parametric case, our knowledge and techniques 
to analyze conventional, non-parametric traces against conventional, non- 
parametric properties? 

In this paper we first formulate and then rigorously answer and empirically validate our 
answer to these questions, in the context of runtime verification. In doing so, a technique 
for trace slicing is also presented and shown correct, which we regard as one of the central 
results in parametric trace analysis. In short, our overall approach to monitor a parametric 
property ΛX . P is to observe the parametric trace as it is being generated by the running
system, slice it online with respect to the used parameter instances, and then send each 
slice piece-wise to a non-parametric monitor corresponding to the base property P; this 
way, multiple monitor instances for P can and typically do coexist, one for each trace slice. 
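
The idea of keeping one non-parametric monitor per trace slice can be sketched in a few lines of Python for the simplest situation, where the property has a single parameter and every event carries it. This is not the general algorithm developed later in the paper, which must also handle events with partial bindings; the class and function names are assumptions made for illustration. The base monitor checks the iterator property from above, namely that every next directly follows a hasnexttrue.

```python
class PrevHasNextMonitor:
    """Non-parametric monitor: every next must directly follow a hasnexttrue."""

    def __init__(self):
        self.last = None
        self.violated = False

    def process(self, event):
        if event == "next" and self.last != "hasnexttrue":
            self.violated = True
        self.last = event
        return self.violated

def monitor(trace, make_monitor):
    """Dispatch each event to the monitor of its parameter instance."""
    monitors = {}      # one monitor instance per parameter instance
    violations = []    # instances reported, in order of detection
    for name, instance in trace:
        if instance not in monitors:
            monitors[instance] = make_monitor()
        m = monitors[instance]
        was_violated = m.violated
        if m.process(name) and not was_violated:
            violations.append(instance)
    return violations

trace = [("hasnexttrue", "i1"), ("hasnextfalse", "i2"), ("next", "i1"), ("next", "i2")]
print(monitor(trace, PrevHasNextMonitor))  # ['i2']
```

The dispatcher never inspects the monitor's internals beyond its verdict, so any other non-parametric monitor with the same interface could be plugged in unchanged.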

The main conceptual limitation of our approach is that the parametrization of properties 
is only allowed at the top-level, that is, the base property P in the parametric property 
ΛX . P cannot have any Λ binders. In other words, we do not consider nested parameters.
To allow nested parameters one needs a syntax for properties, so that one can incorporate 
the syntax for parameters within the syntax for properties. However, one of our major goals 
is to be formalism-independent, which means that, by the nature of the problem that we 
are attempting to solve, we can only parameterize properties at the top. Many runtime
verification approaches deliberately accept the same limitation, as discussed below, because 
arbitrarily nested parameters are harder to understand and turn out to generate higher 
runtime overhead in the systems supporting them. 

Our concrete contributions are explained after the related work. 

1.2. Related Work. We here discuss several major approaches that have been proposed so 
far to specify and monitor parametric properties, and relate them to our subsequent results 
in this paper. It is worth mentioning upfront that, except for the MOP approach [28], which
motivated and inspired the work in this paper, the existing approaches do not follow the 
general methodology proposed by our approach in this paper. More precisely, they employ 
a monolithic monitoring approach, where one monitor is associated to each parametric 
property; the monitor receives and handles each parametric event in a formalism-specific 
way. In contrast, our approach is to generate multiple local monitors, each keeping track 
of one parameter instance. Our approach not only leads to a lower runtime overhead, as
empirically shown in Section [9], but also allows us to separate concerns (i.e., the parameter
handling from the specification formalism and monitor synthesis for the basic property),
thus potentially enabling a broader spectrum of optimizations that work for various different
property specification formalisms and corresponding monitors. 

Tracematches [H H] is an extension of AspectJ [2] supporting parametric regular pat- 
terns; when patterns are matched during the execution, user-defined advice can be triggered. 
J-LO [9] is a variation of Tracematches that supports a first-order extension of linear
temporal logic (LTL) with data parametrization by means of quantifiers [33]; the
user-provided actions are executed when the LTL properties are violated. Also based on
AspectJ, [24] proposes Live Sequence Charts (LSC) [17] as an inter-object scenario-based
specification formalism; LSC is implicitly parametric, requiring parameter bindings at run- 
time. Tracematches, J-LO and LSC [24] support a limited number of parameters, and each 
has its own approach to handle parameters, specific to its particular specification formalism. 
Our semantics-based approach in this paper is generic in the specification formalism and 
admits, in theory, a potentially unlimited number of parameters. In spite of the generality 
of our theoretical results, we chose in our current implementations (see Section [9]) to also
support only a bounded number of parameters, like in the aforementioned approaches. 

JavaMOP [28, 14] (http://javamop.org) is a parametric specification and monitoring
system that is generic in the specification formalism for base properties, each formalism 
being included as a logic plugin. Monitoring code is generated from parametric specifications 
and woven within the original Java program, also using AspectJ, but using a different 
approach that allows it to encapsulate monitors for non-parametric properties as blackboxes. 
Until recently, JavaMOP's genericity came at a price: it could only monitor execution traces
in which the first event in each slice instantiated all the property parameters. This limitation
prevented the earlier JavaMOP system from monitoring some basic parametric
properties, including ones discussed in this paper. Our novel approach to parametric trace 
slicing and monitoring discussed in this paper does not have that limitation anymore. The 
parametric slicing and monitoring technique discussed in this paper has been incorporated 
both in JavaMOP and in its commercial-grade successor RV [27], together with several
optimizations that we do not discuss here; Section [9] discusses experiments done with both 
these systems, as well as with Tracematches, for comparison, because Tracematches has 
proven to be the most efficient runtime verification system besides JavaMOP. 
Program Query Language (PQL) [25] allows the specification and monitoring of parametric
context-free grammar (CFG) patterns. Unlike the approaches above that only allow
a bounded number of property parameters, PQL can associate parameters with sub-patterns 
that can be recursively matched at runtime, yielding a potentially unbounded number of 
parameters. PQL's approach to parametric monitoring is specific to its particular CFG- 
based specification formalism. Also, PQL's design does not support arbitrary execution 
traces. For example, field updates and method begins are not observable; to circumvent the 
latter, PQL allows for observing traces local to method calls. Like PQL, our technique also 
allows an unlimited number of parameters (but as mentioned above, our current implemen- 
tation supports only a bounded number of parameters). Unlike PQL, our semantics and 
techniques are not limited to particular events, and are generic in the property specification 
formalism; CFGs are just one such possible formalism. 

Eagle [5], RuleR [6], and Program Trace Query Language (PTQL) [19] are very general 
trace specification and monitoring systems, whose specification formalisms allow complex 
properties with parameter bindings anywhere in the specification (not only at the beginning, 
like we do). Eagle and RuleR are based on fixed-point logics and rewrite rules, while PTQL 
is based on SQL relational queries. These systems tackle a different aspect of generality 
than we do: they attempt to define general specification formalisms supporting data binding 
among many other features, while we attempt to define a general parameterization approach 
that is logic- independent. As discussed in [H EU [I^ (Eagle and PQL cases), the very 
general specification formalisms tend to be slower; this is not surprising, as the more general 
the formalism the less the potential for optimizations. Our techniques can be used as an 
optimization for certain common types of properties expressible in these systems: use any 
of these to specify the base property P, then use our generic techniques to analyze ΛX . P.

1.3. Contributions. Besides proposing a formal semantics to parametric traces, proper- 
ties, and monitoring, we make two theoretical contributions and discuss implementations 
that validate them empirically: 

(1) Our first result is a general-purpose online parametric trace slicing algorithm (algorithm
A(X) in Section [6]) together with its proof of correctness (Theorem [1]), positively
answering the following question: given a parametric execution trace, can one effectively
find the slices corresponding to each parameter instance without having to traverse the 
trace for each instance? 

(2) Our second result, building upon the slicing algorithm, is an online monitoring
technique (algorithms M(X) and C(X) in Section [8]) together with its proof of correctness
(Theorems [2] and [3]), which positively answers the following question: is it possible to
monitor arbitrary parametric properties ΛX . P against parametric traces, provided that
the root property P is monitorable using conventional monitors? 

(3) Finally, our implementation of these techniques in the JavaMOP and RV systems pos- 
itively answers the following question: can we implement general purpose and unre- 
stricted parametric property monitoring tools which are comparable in performance with 
or even outperform existing parametric property monitoring tools on the restricted types 
of properties and/or traces that the latter support? 

Preliminary results reported in this paper were published in a less polished form as a
technical report in summer 2008 [30]. A shorter conference paper was then presented at
TACAS 2009 in York, U.K. [16]. This extended paper differs from [16] as follows:
(1) It defines all the mathematical infrastructure needed to prove the results claimed in
[16]. For example, Section [3] is new.

(2) It expands the results in [16] and includes all their proofs, as well as additional results
needed for those proofs. For example, Section [5] is new. 

(3) It discusses more examples of parametric properties. For example, Section [2] is new. 

(4) The implementation section in [16] presented an incipient implementation of our
technique in a prototype system called PMon there (from Parametric Monitoring). In the
meantime, we have implemented the technique described in this paper as an integral
part of the runtime verification systems JavaMOP (http://javamop.org) and RV [27].
The implementation section (Section [9]) now refers to these systems.

1.4. Paper Structure. Section [2] discusses examples of parametric properties. Section [3] 
provides the mathematical background needed to formalize the concepts introduced later in 
the paper. Section [4] formalizes parametric events, traces and properties, and defines trace
slicing. Section [5] establishes a tight connection between the parameter instances in a trace
and the parameter instance used for slicing. Sections [6], [7] and [8] discuss our main techniques
for parametric trace slicing and monitoring, and prove them correct. Section [9] discusses
implementations of these techniques in two related systems, JavaMOP and RV. Section [10]
concludes and proposes future work. 

2. Examples of Parametric Properties 

In this section we discuss several examples of parametric properties. Our purpose here is 
twofold. On the one hand we give the reader additional intuition and motivation for the 
subsequent semantics and algorithms, and, on the other hand, we justify the generality 
of our approach with respect to the employed specification formalism for trace properties. 
The discussed examples of parametric properties are defined using various trace specification 
formalisms, some with more than one parameter and some with categories of behaviors
beyond validating and violating. For each of the examples, we give hints on how our
subsequent techniques in Sections [6] and [8] work. In order to explain the examples in this
section we also informally introduce necessary notions, such as events and traces (both
parametric and non-parametric); all these notions will be formally defined in Section [4].

For each example, we also discuss which of the existing runtime verification systems 
can support it. Note that JavaMOP [28] and its commercial-grade successor RV [27], which 
build upon the trace slicing and monitoring techniques presented in this paper, are the only 
runtime verification systems that support all the parametric properties discussed below. 

2.1. Releasing acquired resources. Consider a certain type of resource (e.g., synchro- 
nization objects) that can be acquired and released by a given procedure, and suppose 
that we want the resources of this type to always be explicitly released by the procedure 
whenever acquired and only then. This example will be broken in subparts and used as a 
running example in Section [4] to introduce our main notions and notations. 

Let us first consider the non-parametric case in which we have only one resource.
Supposing that the four events of interest, i.e., the begin/end of the procedure and the
acquire/release of the resource, are ℰ = {begin, end, acquire, release}, then the following regular






G. ROŞU AND F. CHEN



pattern P captures the desired behavior requirements: 

P = (begin (ε | (acquire (acquire | release)* release)) end)*

The above regular pattern states that the procedure can take place multiple times and, if the 
resource is acquired then it is released by the end of the procedure (ε is the empty word). For
simplicity, we here assume that the procedure is not recursive and that the resource can be 
acquired and released multiple times, with the effect of acquiring and respectively releasing it 
precisely once; Section [2.4] shows how to use a context-free pattern to specify possibly
recursive procedures with matched acquire/release events within each procedure invocation. One
matching execution trace for this property is, e.g., begin acquire acquire release end begin end. 

Let us now consider the parametric case in which we may have more than one resource 
and we want each of them to obey the requirements specified above. Now the events acquire 
and release are parametric in the resource being acquired or released, that is, they have 
the form acquire(r1), release(r2), etc. The begin/end events take no parameters, so we write
them begin() and end(). A parametric trace τ for our running example can be the following:

τ = begin() acquire(r1) acquire(r2) acquire(r1) release(r1) end() begin() acquire(r2) release(r2) end()

This trace involves two resources, r1 and r2, and it really consists of two trace slices merged
together, one for each resource:

(r1) : begin acquire acquire release end begin end
(r2) : begin acquire end begin acquire release end

The begin and end events belong to both trace slices. Since we know the parameter instance 
for each trace slice and we know the types of parameters for each event, to avoid clutter we 
do not mention the redundant parameter bindings of events in trace slices. 
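
The two slices above can be reproduced by a direct reading of what a slice is: keep every event whose binding is contained in the chosen parameter instance, and forget the bindings. The Python sketch below does exactly that; the tuple-and-dictionary representation of events and the name slice_trace are assumptions of this illustration. Note that it traverses the whole trace once per instance, which is precisely the cost that the slicing algorithm of Section [6] is designed to avoid.

```python
def slice_trace(trace, theta):
    """Keep the base events whose binding is contained in the instance theta."""
    return [name for name, binding in trace
            if all(theta.get(p) == v for p, v in binding.items())]

R1, R2 = {"r": "r1"}, {"r": "r2"}
trace = [("begin", {}), ("acquire", R1), ("acquire", R2), ("acquire", R1),
         ("release", R1), ("end", {}), ("begin", {}), ("acquire", R2),
         ("release", R2), ("end", {})]

print(" ".join(slice_trace(trace, R1)))  # begin acquire acquire release end begin end
print(" ".join(slice_trace(trace, R2)))  # begin acquire end begin acquire release end
```

Events with an empty binding, such as begin() and end(), pass the containment test for every instance, which is why they appear in both slices.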

Our trace slicing algorithm discussed in Section [6] processes the parametric trace only 
once, traversing it from the first parametric event to the last, incrementally calculating 
a collection of meaningful trace slices so that it can quickly identify and report the slice 
corresponding to any parameter instance when requested. 

Note that the (r1) trace slice matches the specification P above, while the (r2) trace
slice does not. To distinguish parametric properties referring to multiple trace slices from
ordinary properties, we explicitly list the parameters using a special Λ binder. For example,
our property above parametric in the resource r is Λr . P, or

Λr . (begin (ε | (acquire (acquire | release)* release)) end)*

Both Tracematches IXl H] and JavaMOP [28] can specify/monitor such parametric regular 
properties, the latter using its extended-regular expression (ERE) plugin. 

For terminology, P is called a non-parametric, or a root, or a basic property,
in contrast to Λr . P, which is called a parametric property. As detailed in Section [4],
parametric properties are functions taking a parametric trace (e.g., τ) and a parameter
instance (e.g., r ↦ r1 or r ↦ r2) into a verdict category for the basic property P (e.g.,
match or fail). In our case, the semantics of our parametric property Λr . P takes parametric
trace τ and parameter instance r ↦ r1 to match, and takes τ and r ↦ r2 to fail, that is,

(Λr . P)(τ)(r ↦ r1) = match
(Λr . P)(τ)(r ↦ r2) = fail

Our parametric monitoring algorithm in Section [8] reports a fail for instance r ↦ r2 precisely
when the first end event is encountered.
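
The two verdicts, and the point at which the failure becomes observable, can be checked with a small hand-built automaton for P. The Python sketch below is an assumption-laden illustration: the state names and the "?" verdict for prefixes that neither match nor fail are choices of this sketch, and a missing transition is read as "no extension of this prefix can match P".

```python
# Hand-built DFA for P = (begin (ε | acquire (acquire | release)* release) end)*.
# State names are illustrative; a missing transition means no extension can match.
DELTA = {
    ("outside", "begin"): "idle",
    ("idle", "end"): "outside",
    ("idle", "acquire"): "held",
    ("held", "acquire"): "held",
    ("held", "release"): "released",
    ("released", "acquire"): "held",
    ("released", "release"): "released",
    ("released", "end"): "outside",
}

def verdicts(slice_events):
    """Return the verdict after each event: 'match', 'fail', or '?'."""
    state, out = "outside", []
    for event in slice_events:
        state = DELTA.get((state, event))  # None once the slice has failed
        out.append("fail" if state is None
                   else "match" if state == "outside" else "?")
    return out

slice_r1 = "begin acquire acquire release end begin end".split()
slice_r2 = "begin acquire end begin acquire release end".split()
print(verdicts(slice_r1)[-1])             # match
print(verdicts(slice_r2).index("fail"))   # 2, the position of the first end event
```

For the r2 slice the automaton is in the state where the resource is held when the first end arrives, so that event is the earliest one at which a fail can be reported.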
We would like to make two observations at this stage. First, as we already mentioned,
we only parameterize a property at the top, that is, the Λ binder cannot be used inside the
basic property. Indeed, since we do not enforce any particular syntax for basic properties,
it is not clear how to mix the Λ binder with the nonexistent property constructs. Second,
one should not confuse our parameters with universally quantified variables. While in our
example above Λ may feel like a universal quantifier, note that one may prefer to specify
the same parametric property in a more negative fashion, for example to specify the bad
behaviors instead of the positive ones. Relying on the fact that the begin and end events
must be correctly matched, one can state only the bad patterns, which are a begin followed
by a release and an acquire followed by an end:

Λr . (ℰ* (begin release | acquire end) ℰ*)

The right way to regard a parametric property is as one indexed by all possible instances 
of the parameters, each instance having its own interpretation of the trace (caring only about
the events relevant to it), which is orthogonal to the other instances' interpretations. 

2.2. Authenticate before use. Consider a server authenticating and using keys, say k1,
k2, k3, etc., whose execution traces contain events authenticate(k1), use(k2), etc. A possible
trace of such a system can be

τ = authenticate(k1) authenticate(k3) use(k3) use(k2) authenticate(k2) use(k1) use(k2) use(k3)

A parametric property for such a system can be "each key must be authenticated before use" , 
which, using linear temporal logic (LTL) as a specification formalism for the corresponding 
base property, can be expressed as 

Λk . □(use → ⟐authenticate)

Such parametric LTL properties can be expressed in both J-LO and JavaMOP [281 El] 
(the latter using its LTL logic plugin). For the trace above, the trace slice corresponding to
k3 is authenticate use use, corresponding to the parametric subtrace authenticate(k3) use(k3) use(k3)
of events relevant to k3 in τ, but keeping only the base events; also, the trace slice
corresponding to k2 is use authenticate use. Our trace slicing algorithm in Section [6] can detect
these slices. Moreover, with the finite trace LTL semantics in [32],

(Λk . □(use → ⟐authenticate))(τ)(k ↦ k3) = true
(Λk . □(use → ⟐authenticate))(τ)(k ↦ k2) = false

Our parametric monitoring algorithm in Section [8] reports a violation for instance k ↦ k2
precisely when the first use(k2) is encountered.
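
For this particular property the per-key verdicts can be computed in a single pass with very little state. The Python sketch below hard-codes the base property "every use is preceded at some earlier point by an authenticate" instead of wrapping a generic LTL monitor, so it illustrates the expected verdicts rather than the paper's monitoring algorithm; the function name and the use of 0-based event indices are assumptions of this sketch.

```python
def check_authenticate_before_use(trace):
    """Single pass over the trace; returns per-key verdicts and first violations."""
    authenticated = set()   # keys whose slice already contains authenticate
    verdict = {}            # key -> True while its slice satisfies the property
    first_violation = {}    # key -> index of the event that first violated it
    for index, (name, key) in enumerate(trace):
        verdict.setdefault(key, True)
        if name == "authenticate":
            authenticated.add(key)
        elif name == "use" and key not in authenticated:
            verdict[key] = False
            first_violation.setdefault(key, index)
    return verdict, first_violation

trace = [("authenticate", "k1"), ("authenticate", "k3"), ("use", "k3"),
         ("use", "k2"), ("authenticate", "k2"), ("use", "k1"),
         ("use", "k2"), ("use", "k3")]
verdict, first_violation = check_authenticate_before_use(trace)
print(verdict)          # {'k1': True, 'k3': True, 'k2': False}
print(first_violation)  # {'k2': 3}
```

The later authenticate(k2) does not restore the verdict for k2, since a violated "always" property stays violated on every extension of the slice.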

2.3. Safe iterators. Consider the following property for iterators created over vectors: 
when an iterator is created for a vector, one is not allowed to modify the vector while its 
elements are traversed using the iterator. The JVM usually throws a runtime exception 
when this occurs, but the exception is not guaranteed in a multi-threaded environment. 
Supposing that parametric event create(v i) is generated when iterator i is created for vector
v, update(v) is generated when v is modified, and next(i) is generated when i is accessed
using its "next element" interface, then one can write it as the parametric regular property

Λv, i . create next* update+ next.
Such parametric regular expression properties can be expressed in both Tracematches [1]
and JavaMOP [28, 14] (the latter using its ERE plugin). We here assumed that the matching
of the regular expression corresponds to violation of the base property. Thus, the
parametric property is violated by a given trace and a given parameter instance whenever
the regular pattern above is matched by the corresponding trace slice. For example, if
τ = create(v1 i1) next(i1) create(v1 i2) update(v1) next(i1) is a parametric trace where two
iterators are created for a vector, then the slice corresponding to (v1 i1) is create next update next
and the one corresponding to (v1 i2) is create update, so τ violates the parametric property
(i.e., matches the regular pattern above) on instance (v1 i1), but not on instance (v1 i2). Note
that in this example there is more than one parameter in events, traces and property,
namely a vector and an iterator. Indeed, the main difficulty of our techniques in Sections
[6] and [8] was precisely to handle general purpose parametric properties with an arbitrary
number of parameters. The slicing algorithm in Section [6] processes parametric traces and
maintains enough slicing information so that, when asked to produce slices corresponding
to particular parameter instances, e.g., to (v1 i2), it can do so without any further analysis
of the trace. Also, in this case, the monitoring algorithm in Section [8] reports a match each 
time a parameter instance yields a matching trace slice. 
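
The slices in this example can be recomputed with a naive sketch. It assumes that an event belongs to the slice of an instance exactly when the event's own binding is less informative than that instance (in the sense of the ordering on partial functions introduced in Section 3); the names `trace_slice` and `violates`, and the encoding of bindings as Python dictionaries, are illustrative choices. It rescans the whole trace for each instance and so is not the slicing algorithm of Section 6.

```python
import re


def is_less_informative(theta, theta_prime):
    """True iff every binding of theta also appears, unchanged, in theta_prime."""
    return all(x in theta_prime and theta_prime[x] == v for x, v in theta.items())


def trace_slice(trace, instance):
    """Keep the base events whose parameter binding is compatible with,
    and no more informative than, the given parameter instance."""
    return [name for name, binding in trace if is_less_informative(binding, instance)]


# The parametric trace from the text: two iterators created over one vector.
TRACE = [
    ("create", {"v": "v1", "i": "i1"}),
    ("next", {"i": "i1"}),
    ("create", {"v": "v1", "i": "i2"}),
    ("update", {"v": "v1"}),
    ("next", {"i": "i1"}),
]

# Base property pattern "create next* update+ next" over space-separated events.
PATTERN = re.compile(r"create( next)*( update)+ next")


def violates(trace, instance):
    """A slice that matches the whole pattern counts as a violation."""
    return PATTERN.fullmatch(" ".join(trace_slice(trace, instance))) is not None
```

With these definitions the instance binding the iterator to i1 yields the slice create next update next and is reported as a violation, while the instance for i2 yields create update and is not.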

2.4. Correct locking. Consider a custom implementation of synchronization in which one 
can acquire and release locks manually (like in Java 5 and later versions). A basic prop- 
erty is that each function releases each lock as many times as it acquires it. Assuming 
that the executing code is always inside some function (like in Java, C, etc.), that begin() 
and end() events are generated whenever function executions are started and terminated, 
respectively, and that $\mathsf{acquire}\langle l\rangle$ and $\mathsf{release}\langle l\rangle$ events are generated whenever lock $l$ is
acquired or released, respectively, then one can specify this safety property using the following
parametric context-free grammar (CFG) pattern:

$\Lambda l.\ S \rightarrow \mathsf{begin}\ S\ \mathsf{end} \mid S\ \mathsf{acquire}\ S\ \mathsf{release} \mid \epsilon$

Such parametric CFG properties can be expressed in both PQL |25] and JavaMOP [281 ISS] 
(the latter using its CFG plugin). We here borrow the CFG property semantics of the CFG
plugin of JavaMOP (and also RV [27]) in [26], that is, this parametric property is violated 
by a parametric execution with a given parameter instance (i.e., concrete lock) whenever the 
corresponding trace slice cannot be completed into one accepted by the grammar language. 
While this property can be expressed in JavaMOP and even monitored in its non-parametric 
form, the previous implementation of JavaMOP in [26] cannot monitor it as a parametric 
property because its violating traces most likely start with a property-relevant begin() event, 
which does not contain a lock parameter; therefore, the previous limitation of JavaMOP 
(allowing only events that instantiate all property's parameters to create a monitor instance) 
did not allow us to monitor this natural CFG property. To circumvent this limitation, [26] 
proposed a different way to specify this property, in which the violating traces started with 
an acquire(/) event. We do not need such artificial encodings anymore in the new version of 
JavaMOP, and they were never needed in RV (RV improves the new JavaMOP). 

For profiling reasons, one may also want to take notice of validations, or matches of 
the property, as well as matches followed by violation, etc.; one can therefore have different 
interpretations of CFG patterns as base properties, classifying traces into various categories. 
What is different in this example, compared to the previous ones, is that the non-parametric 
property cannot be implemented as a finite state machine. With the CFG monitoring 
algorithm proposed in [26] used to monitor the base property, our parametric monitoring
algorithm in Section 8 reports a violation of this parametric CFG property as soon as a
parameter instance is detected for which the corresponding trace slice has no future, that 
is, it admits no continuation into a trace in the language of the grammar. 

2.5. Safe resource use by safe client. A client can use a resource only within a given 
procedure and, when that happens, both the client and the resource must have been pre- 
viously authenticated as part of that procedure. Assuming the procedure fixed and given, 
this is a property over traces with five types of events: begin and end of the procedure 
(begin() and end()), authenticate of client (auth-client(c)) or of resource (auth-resource(r)), and 
use of resource by client (use(r c)). Using the past time linear temporal logic with calls and 
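returns (ptCaRet) in [31], one would write it as follows:

$\Lambda r, c.\ \mathsf{use} \rightarrow ((\overline{\Diamond\!\!\!\!\cdot}\ \mathsf{begin}) \wedge \neg((\neg\,\mathsf{auth\text{-}client})\ \overline{\mathcal{S}}\ \mathsf{begin}) \wedge \neg((\neg\,\mathsf{auth\text{-}resource})\ \overline{\mathcal{S}}\ \mathsf{begin}))$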

The overlined operators are abstract variants of temporal operators, in the sense that they 
are defined on traces that collapse terminated procedure calls (erase subtraces bounded 
by matched begin/end events). For example, $\overline{\Diamond\!\!\!\!\cdot}\ \mathsf{begin}$ holds only within a procedure call,
because all the nested and terminated procedure calls are abstracted away. In words, the 
above says: if one sees the use of the resource (use) then that must take place within the 
procedure and it is not the case that one did not see, within the main procedure since its 
latest invocation, the authentication of the client or the authentication of the resource. 

JavaMOP can express this property using its ptCaRet logic plugin [31]. However, until
recently [28], JavaMOP could again only monitor it in its non-parametric form, because of
its previous limitation allowing only completely parameterized events to create monitors.
Even though it may appear that this property can only be violated when a completely
parameterized $\mathsf{use}\langle r\,c\rangle$ event is observed, in fact, the monitor must already exist at that
point in the execution and "know" whether the client and the resource have authenticated 
since the begin of the current procedure; all the other events involved in the property 
are incompletely parameterized, so, unfortunately, this parametric property could not be 
monitored using the previous JavaMOP system, but it can be monitored with the new one. 

2.6. Success ratio. Consider now parametric traces with events $\mathsf{success}\langle a\rangle$ and $\mathsf{fail}\langle a\rangle$, saying
whether a certain action $a$ was successful or not. For a given action, a meaningful
property can classify its (non-parametric) traces into an infinite number of categories, each
representing a success ratio of the given action, which is a (rational) number $s/t$ between
0 and 1, where $s$ is the number of success events in the trace and $t$ is the total number
of events in the trace. Then the corresponding parametric property over such parametric 
traces gives a success ratio for each action. We can specify such a property in JavaMOP 
and RV by making use of monitor variables and event actions [28]. Indeed, one can add 
two monitor variables, s and t, and then increment t in each event action and increment 
s only in the event action of the success event. The underlying parametric monitoring al- 
gorithm keeps separation between the various s and t monitor variables, one such pair for 
each distinct action a, guaranteeing that the correct ones will be accessed. 
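
The bookkeeping just described can be sketched as follows. This is only an illustration of keeping one pair of counters per action, not JavaMOP or RV syntax; the class name `SuccessRatioMonitor` and the function `success_ratios` are assumptions.

```python
from collections import defaultdict
from fractions import Fraction


class SuccessRatioMonitor:
    """Base monitor for one action: counts successes (s) and all events (t)."""

    def __init__(self):
        self.s = 0
        self.t = 0

    def on_event(self, name):
        self.t += 1              # every event increments t
        if name == "success":
            self.s += 1          # only success events increment s

    def ratio(self):
        return Fraction(self.s, self.t)


def success_ratios(trace):
    """Dispatch each (event_name, action) pair to the monitor of its action."""
    monitors = defaultdict(SuccessRatioMonitor)   # one monitor per action
    for name, action in trace:
        monitors[action].on_event(name)
    return {action: monitor.ratio() for action, monitor in monitors.items()}
```

A monitor is created only when its action is first observed, so every monitor has seen at least one event and the ratio is always well defined.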
3. Mathematical Background: Partial Functions, Least Upper Bounds (lubs) and lub Closures

In this section we first discuss some basic notions of partial functions and least upper bounds 
of them, then we introduce least upper bounds of sets of partial functions and least upper 
bound closures of sets of partial functions. This section is rather mathematical. We need 
these mathematical notions because it turns out that parameter instances are partial maps 
from the domain of parameters to the domain of parameter values. As shown later, whenever 
a new parametric event is observed, it needs to be dispatched to the interested parts (trace 
slices or monitors), and those parts updated accordingly: these informal operations can be 
rigorously formalized as existence of least upper bounds and least upper bound closures 
over parameter instances, i.e., partial functions. 

We recommend that readers who are interested only in our algorithms, and not in the
details of our subsequent proofs, read the first two definitions and then jump to Section
4, returning to this section for more mathematical background only when needed.

3.1. Partial Functions. This section discusses partial functions and (least) upper bounds 
over sets of partial functions. The notions and the results discussed in this section are 
broadly known, and many of their properties are folklore. They can be found in one shape 
or another in virtually any book on denotational semantics or domain theory. Since we need 
only a small subset of notions and results on partial functions and (least) upper bounds in 
this paper, and since we need to fix a uniform notation anyway, we prefer to define and 
prove everything we need and hereby also make our paper self-contained. 

We think of partial functions as "information carriers": if a partial function $\theta$ is defined
on an element $x$ of its domain, then "$\theta$ carries the information $\theta(x)$ about $x \in X$". Some
partial functions can carry more information than others; two or more partial functions can,
together, carry compatible information, but can also carry incompatible information (when
two or more of them disagree on the information they carry for a particular $x \in X$).

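Definition 1. We let $[X \to V]$ and $[X \rightharpoonup V]$ denote the sets of total and of partial
functions from $X$ to $V$, respectively. The domain of $\theta \in [X \rightharpoonup V]$ is the set $\mathrm{Dom}(\theta) =
\{x \in X \mid \theta(x) \text{ is defined}\}$. Let $\bot \in [X \rightharpoonup V]$ be the map undefined everywhere, that is,
$\mathrm{Dom}(\bot) = \emptyset$. If $\theta, \theta' \in [X \rightharpoonup V]$ then:

(1) $\theta$ and $\theta'$ are compatible if and only if $\theta(x) = \theta'(x)$ for any $x \in \mathrm{Dom}(\theta) \cap \mathrm{Dom}(\theta')$;

(2) $\theta$ is less informative than $\theta'$, written $\theta \sqsubseteq \theta'$, if for any $x \in X$, $\theta(x)$ defined implies
$\theta'(x)$ also defined and $\theta'(x) = \theta(x)$;

(3) $\theta$ is strictly less informative than $\theta'$, written $\theta \sqsubset \theta'$, when $\theta \sqsubseteq \theta'$ and $\theta \neq \theta'$.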

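The relation of compatibility is reflexive and symmetric, but not transitive. When
$\theta, \theta' \in [X \rightharpoonup V]$ are compatible, we let $\theta \sqcup \theta' \in [X \rightharpoonup V]$ denote the partial function whose
domain is $\mathrm{Dom}(\theta) \cup \mathrm{Dom}(\theta')$ and which is defined as $\theta$ or $\theta'$ in each element in its domain.
The partial function $\theta \sqcup \theta'$ is called the least upper bound of $\theta$ and $\theta'$. We define least
upper bounds more generally below. Also, note that $\theta \sqsubseteq \theta'$ and, respectively, $\theta \sqsubset \theta'$, iff $\theta, \theta'$
are compatible and $\mathrm{Dom}(\theta) \subseteq \mathrm{Dom}(\theta')$ and, respectively, $\mathrm{Dom}(\theta) \subset \mathrm{Dom}(\theta')$.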
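
As a concrete reading of Definition 1, finite partial functions can be modelled as Python dictionaries, with the empty dictionary playing the role of $\bot$. This encoding and the function names below are assumptions made for illustration only.

```python
def compatible(theta1, theta2):
    """theta1 and theta2 agree on every element where both are defined."""
    return all(theta2[x] == v for x, v in theta1.items() if x in theta2)


def less_informative(theta1, theta2):
    """theta1 is below theta2: whatever theta1 defines, theta2 defines identically."""
    return all(x in theta2 and theta2[x] == v for x, v in theta1.items())


def lub2(theta1, theta2):
    """Least upper bound of two partial functions, or None if incompatible."""
    if not compatible(theta1, theta2):
        return None
    return {**theta1, **theta2}   # union of the two graphs


BOTTOM = {}   # the partial function undefined everywhere
```

The failure of transitivity is easy to see in this encoding: a binding of v to v1 and a binding of v to v2 are each compatible with the empty function, but not with each other.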

Definition 2. Given $\Theta \subseteq [X \rightharpoonup V]$ and $\theta' \in [X \rightharpoonup V]$,

(1) $\theta'$ is an upper bound of $\Theta$ iff $\theta \sqsubseteq \theta'$ for any $\theta \in \Theta$; $\Theta$ has upper bounds iff there is
a $\theta'$ which is an upper bound of $\Theta$;

(2) $\theta'$ is the least upper bound (lub) of $\Theta$ iff $\theta'$ is an upper bound of $\Theta$ and $\theta' \sqsubseteq \theta''$ for
any other upper bound $\theta''$ of $\Theta$;

(3) $\theta'$ is the maximum (max) of $\Theta$ iff $\theta' \in \Theta$ and $\theta'$ is a lub of $\Theta$.

A set of partial functions has an upper bound iff the partial functions in the set are
pairwise compatible, that is, no two of them disagree on the value of any particular element
in their domain. Least upper bounds and maximums may not always exist for any $\Theta \subseteq
[X \rightharpoonup V]$; if a lub or a maximum for $\Theta$ exists, then it is, of course, unique, because $\sqsubseteq$ is a
partial order, so antisymmetric.

It is known that $([X \rightharpoonup V], \sqsubseteq, \bot)$ is a complete (i.e., any $\sqsubseteq$-chain has a least upper
bound) partial order with bottom (i.e., $\bot$).

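Definition 3. Given $\Theta \subseteq [X \rightharpoonup V]$, let $\sqcup\Theta$ and $\max\Theta$ be the lub and the max of $\Theta$,
respectively, when they exist. When $\Theta$ is finite, one may write $\theta_1 \sqcup \theta_2 \sqcup \cdots \sqcup \theta_n$ instead of
$\sqcup\{\theta_1, \theta_2, \ldots, \theta_n\}$.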

If $\Theta$ has a maximum, then it also has a lub and $\sqcup\Theta = \max\Theta$. Here are several common
properties that we use frequently in Sections 3.2 and 3.3 (these sections will present less
known results specific to our particular approach to parametric slicing and monitoring):

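Proposition 1. The following hold ($\theta, \theta_1, \theta_2, \theta_3 \in [X \rightharpoonup V]$): $\bot \sqcup \theta$ exists and $\bot \sqcup \theta = \theta$;
$\theta_1 \sqcup \theta_2$ exists iff $\theta_2 \sqcup \theta_1$ exists, and, if they exist then $\theta_1 \sqcup \theta_2 = \theta_2 \sqcup \theta_1$; $\theta_1 \sqcup (\theta_2 \sqcup \theta_3)$ exists
iff $(\theta_1 \sqcup \theta_2) \sqcup \theta_3$ exists, and if they exist then $\theta_1 \sqcup (\theta_2 \sqcup \theta_3) = (\theta_1 \sqcup \theta_2) \sqcup \theta_3$.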

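Proposition 2. Let $\Theta \subseteq [X \rightharpoonup V]$. Then

(1) $\Theta$ has an upper bound iff for any $\theta_1, \theta_2 \in \Theta$ and $x \in X$, if $\theta_1(x)$ and $\theta_2(x)$ are defined
then $\theta_1(x) = \theta_2(x)$;

(2) If $\Theta$ has an upper bound then $\sqcup\Theta$ exists and, for any $x \in X$,

$$(\sqcup\Theta)(x) = \begin{cases} \text{undefined} & \text{if } \theta(x) \text{ is undefined for any } \theta \in \Theta \\ \theta(x) & \text{if there is a } \theta \in \Theta \text{ with } \theta(x) \text{ defined.} \end{cases}$$

Proof. Suppose first that $\Theta$ has an upper bound $\theta' \in [X \rightharpoonup V]$, that is, $\theta \sqsubseteq \theta'$ for any $\theta \in \Theta$. If $\theta_1, \theta_2 \in \Theta$
and $x \in X$ are such that $\theta_1(x)$ and $\theta_2(x)$ are defined then $\theta'(x)$ is also defined and $\theta_1(x) =
\theta_2(x) = \theta'(x)$. Suppose now that for any $\theta_1, \theta_2 \in \Theta$ and $x \in X$, if $\theta_1(x)$ and $\theta_2(x)$ are
defined then $\theta_1(x) = \theta_2(x)$. All we need to show in order to prove both results is that we
can find a lub for $\Theta$. Let $\theta' \in [X \rightharpoonup V]$ be defined as follows: for any $x \in X$, let

$$\theta'(x) = \begin{cases} \text{undefined} & \text{if } \theta(x) \text{ is undefined for any } \theta \in \Theta \\ \theta(x) & \text{if there is a } \theta \in \Theta \text{ with } \theta(x) \text{ defined.} \end{cases}$$

First, $\theta'$ above is indeed well-defined, because we assumed that for any $\theta_1, \theta_2 \in \Theta$ and
$x \in X$, if $\theta_1(x)$ and $\theta_2(x)$ are defined then $\theta_1(x) = \theta_2(x)$. Second, $\theta'$ is an upper bound
for $\Theta$: indeed, if $\theta \in \Theta$ and $x \in X$ are such that $\theta(x)$ is defined, then $\theta'(x)$ is also defined and
$\theta'(x) = \theta(x)$, that is, $\theta \sqsubseteq \theta'$ for any $\theta \in \Theta$. Finally, $\theta'$ is a lub for $\Theta$: if $\theta''$ is another upper
bound for $\Theta$ and $\theta'(x)$ is defined for some $x \in X$, that is, $\theta(x)$ is defined for some $\theta \in \Theta$
and $\theta'(x) = \theta(x)$, then $\theta''(x)$ is also defined and $\theta''(x) = \theta(x)$ (as $\theta \sqsubseteq \theta''$), so $\theta' \sqsubseteq \theta''$. □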
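
Proposition 2 is constructive for finite sets of finite partial functions: the lub, when it exists, is the union of the graphs. The sketch below follows that construction, again under the illustrative assumption that partial functions are Python dictionaries; `lub` returns `None` when the set has no upper bound.

```python
def lub(thetas):
    """Least upper bound of a finite collection of partial functions.

    Builds the union of the graphs; returns None as soon as two functions
    disagree on some element, i.e. when no upper bound exists.
    """
    result = {}
    for theta in thetas:
        for x, value in theta.items():
            if x in result and result[x] != value:
                return None          # incompatible: no upper bound
            result[x] = value
    return result
```

In this encoding the lub of the empty collection is the empty dictionary, which plays the role of $\bot$.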

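Proposition 3. The following hold:

(1) The empty set of partial functions $\emptyset \subseteq [X \rightharpoonup V]$ has upper bounds and $\sqcup\emptyset = \bot$;

(2) The one-element sets have upper bounds and $\sqcup\{\theta\} = \theta$ for any $\theta \in [X \rightharpoonup V]$;

(3) The bottom "$\bot$" does not influence the least upper bounds: $\sqcup(\{\bot\} \cup \Theta) = \sqcup\Theta$ for any
$\Theta \subseteq [X \rightharpoonup V]$;

(4) If $\Theta, \Theta' \subseteq [X \rightharpoonup V]$ are such that $\sqcup\Theta'$ exists and for any $\theta \in \Theta$ there is a $\theta' \in \Theta'$ with $\theta \sqsubseteq \theta'$,
then $\sqcup\Theta$ exists and $\sqcup\Theta \sqsubseteq \sqcup\Theta'$; for example, if $\sqcup\Theta'$ exists and $\Theta \subseteq \Theta'$ then $\sqcup\Theta$ exists
and $\sqcup\Theta \sqsubseteq \sqcup\Theta'$;

(5) Let $\{\Theta_i\}_{i \in I}$ be a family of sets of partial functions with $\Theta_i \subseteq [X \rightharpoonup V]$. Then $\sqcup \bigcup\{\Theta_i \mid
i \in I\}$ exists iff $\sqcup\{\sqcup\Theta_i \mid i \in I\}$ exists, and, if both exist,

$\sqcup \bigcup\{\Theta_i \mid i \in I\} = \sqcup\{\sqcup\Theta_i \mid i \in I\}$.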

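Proof. 1., 2. and 3. are straightforward. For 4., since for each $\theta \in \Theta$ there is some
$\theta' \in \Theta'$ with $\theta \sqsubseteq \theta'$, and since $\theta' \sqsubseteq \sqcup\Theta'$ for any $\theta' \in \Theta'$, it follows that $\theta \sqsubseteq \sqcup\Theta'$ for
any $\theta \in \Theta$, that is, that $\sqcup\Theta'$ is an upper bound for $\Theta$. Therefore, by Proposition 2 it
follows that $\sqcup\Theta$ exists and $\sqcup\Theta \sqsubseteq \sqcup\Theta'$ (the latter because $\sqcup\Theta$ is the least upper bound of
$\Theta$). We prove 5. by double implication, each implication stating that if one of the lubs
exists then the other one also exists and one of the inequalities holds; that indeed implies
that one of the lubs exists if and only if the other one exists and, if both exist, then they
are equal. Suppose first that $\sqcup \bigcup\{\Theta_i \mid i \in I\}$ exists, that is, that $\bigcup\{\Theta_i \mid i \in I\}$ has an
upper bound, say $u$. Since $\Theta_i \subseteq \bigcup\{\Theta_i \mid i \in I\}$ for each $i \in I$, it follows first that each $\Theta_i$
also has $u$ as an upper bound, so $\sqcup\Theta_i$ exists for all $i \in I$, and second by 4. above that
$\sqcup\Theta_i \sqsubseteq \sqcup\bigcup\{\Theta_i \mid i \in I\}$ for each $i \in I$. Item 4. above then further implies that $\sqcup\{\sqcup\Theta_i \mid i \in I\}$
exists and $\sqcup\{\sqcup\Theta_i \mid i \in I\} \sqsubseteq \sqcup\{\sqcup \bigcup\{\Theta_i \mid i \in I\}\} = \sqcup \bigcup\{\Theta_i \mid i \in I\}$ (the last equality
follows by 2. above). Conversely, suppose now that $\sqcup\{\sqcup\Theta_i \mid i \in I\}$ exists. Since for each
$\theta \in \bigcup\{\Theta_i \mid i \in I\}$ there is some $i \in I$ such that $\theta \sqsubseteq \sqcup\Theta_i$ (an $i \in I$ such that $\theta \in \Theta_i$), item
4. above implies that $\sqcup \bigcup\{\Theta_i \mid i \in I\}$ also exists and $\sqcup \bigcup\{\Theta_i \mid i \in I\} \sqsubseteq \sqcup\{\sqcup\Theta_i \mid i \in I\}$. □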

3.2. Least Upper Bounds of Families of Sets of Partial Maps. Motivated by requirements
and optimizations of our trace slicing and monitoring algorithms in Sections 6
and 8, in this section and the next we define several less known notions and results. We are
actually not aware of other places where these notions are defined, so they could be novel
and specific to our approach to parametric trace slicing and monitoring.

We first extend the notion of least upper bound (lub) from one associating a partial 
function to a set of partial functions to one associating a set of partial functions to a family 
(or set) of sets of partial functions: 

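Definition 4. If $\{\Theta_i\}_{i \in I}$ is a family of sets in $[X \rightharpoonup V]$, then we let the least upper bound
(also lub) of $\{\Theta_i\}_{i \in I}$ be defined as:

$\sqcup\{\Theta_i \mid i \in I\} \stackrel{\mathrm{def}}{=} \{\sqcup\{\theta_i \mid i \in I\} \mid \theta_i \in \Theta_i \text{ for each } i \in I \text{ such that } \sqcup\{\theta_i \mid i \in I\} \text{ exists}\}$.

As before, we use the infix notation when $I$ is finite, e.g., we may write $\Theta_1 \sqcup \Theta_2 \sqcup \cdots \sqcup \Theta_n$
instead of $\sqcup\{\Theta_i \mid i \in \{1, 2, \ldots, n\}\}$.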

Therefore, $\sqcup\{\Theta_i \mid i \in I\}$ is the set containing all the lubs corresponding to sets formed
by picking for each $i \in I$ precisely one element from $\Theta_i$. Unlike for sets of partial functions,
the lubs of families of sets of partial functions always exist; $\sqcup\{\Theta_i \mid i \in I\}$ is the empty set
when no collection of $\theta_i \in \Theta_i$ can be found (one $\theta_i \in \Theta_i$ for each $i \in I$) such that $\{\theta_i \mid i \in I\}$
has an upper bound.
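
For a finite family of finite sets, Definition 4 can be computed by brute force: enumerate every way of picking one element from each set and keep the lubs that exist. The sketch below assumes partial functions are stored as frozensets of (parameter, value) pairs so that they can be members of Python sets; `freeze`, `lub` and `family_lub` are illustrative names.

```python
from itertools import product


def freeze(theta):
    """Hashable form of a partial function given as a dict."""
    return frozenset(theta.items())


def lub(thetas):
    """Lub of a collection of dict-encoded partial functions, or None."""
    result = {}
    for theta in thetas:
        for x, value in theta.items():
            if x in result and result[x] != value:
                return None
            result[x] = value
    return result


def family_lub(family):
    """Lub of a family (list) of sets of frozen partial functions.

    Picks one element from each set in every possible way and collects
    the lubs of those picks that admit an upper bound.
    """
    result = set()
    for choice in product(*family):
        candidate = lub(dict(theta) for theta in choice)
        if candidate is not None:
            result.add(freeze(candidate))
    return result


BOTTOM = freeze({})
```

Two boundary cases fall out of the enumeration: for the empty family there is exactly one (empty) pick, so the result is the singleton containing $\bot$, and if any set of the family is empty there is no pick at all, so the result is empty.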
There is an admitted slight notational ambiguity between the two least upper bound
notations introduced so far. We prefer to purposely allow this ambiguity instead of inventing
a new notation for the lubs of families of sets, hoping that the reader is able to quickly
disambiguate the two by checking the types of the objects involved in the lub: if partial
functions then the first lub is meant, if sets of partial functions then the second. Note
that such notational ambiguities are actually common practice elsewhere; e.g., in a monoid
$(M, * : M \times M \to M, 1)$ with binary operation $*$ and unit $1$, the $*$ is commonly extended to
sets of elements $M_1, M_2$ in $M$ as expected: $M_1 * M_2 = \{m_1 * m_2 \mid m_1 \in M_1, m_2 \in M_2\}$.

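Proposition 4. The following facts hold, where $\Theta, \Theta_1, \Theta_2, \Theta_3 \subseteq [X \rightharpoonup V]$:

(1) $\sqcup\emptyset = \{\bot\}$, where, in this case, the empty set $\emptyset \subseteq \mathcal{P}([X \rightharpoonup V])$ is meant;

(2) $\sqcup\{\Theta\} = \Theta$; in particular $\sqcup\{\emptyset\} = \emptyset$ when the empty set $\emptyset \subseteq [X \rightharpoonup V]$ is meant;

(3) $\sqcup\{\{\theta\} \mid \theta \in \Theta\} = \begin{cases} \{\sqcup\Theta\} & \text{if } \Theta \text{ has a lub} \\ \emptyset & \text{if } \Theta \text{ does not have a lub;} \end{cases}$

(4) $\emptyset \sqcup \Theta = \emptyset$, where the empty set $\emptyset \subseteq [X \rightharpoonup V]$ is meant;

(5) $\{\bot\} \sqcup \Theta = \Theta$;

(6) If $\Theta_1 \subseteq \Theta_2$ then $\Theta_1 \sqcup \Theta_3 \subseteq \Theta_2 \sqcup \Theta_3$; in particular, if $\bot \in \Theta_2$ then $\Theta_3 \subseteq \Theta_2 \sqcup \Theta_3$;

(7) $(\Theta_1 \cup \Theta_2) \sqcup \Theta_3 = (\Theta_1 \sqcup \Theta_3) \cup (\Theta_2 \sqcup \Theta_3)$.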

Proof. Recall that the least upper bound $\sqcup\{\Theta_i \mid i \in I\}$ of sets of sets of partial functions
is built by collecting all the lubs of sets $\{\theta_i \mid i \in I\}$ containing one element $\theta_i$ from each of
the sets $\Theta_i$. When $|I| = 0$, that is when $I$ is empty, there is precisely one set $\{\theta_i \mid i \in I\}$,
the empty set of partial functions. Then 1. follows by 1. in Proposition 3. When $|I| = 1$,
that is when $\{\Theta_i \mid i \in I\} = \{\Theta\}$ for some $\Theta \subseteq [X \rightharpoonup V]$ like in 2., then the sets $\{\theta_i \mid i \in I\}$
are precisely the singleton sets corresponding to the elements of $\Theta$, so 2. follows by 2.
in Proposition 3. 3. holds because there is only one way to pick an element from each
singleton set $\{\theta\}$, namely to pick the $\theta$ itself; this also shows how the notion of a lub of a
family of sets generalizes the conventional notion of lub. When $|I| \geq 2$ and at least one of
the involved sets of partial functions is empty, like in 4., then there is no set $\{\theta_i \mid i \in I\}$, so
the least upper bound of the set of sets is empty (regarded, again, as the empty set of
partial functions). 5. follows by 1. in Proposition 1. The first part of 6. is immediate and
the second part follows from the first using 5. Finally, 7. follows by double inclusion:
$(\Theta_1 \sqcup \Theta_3) \cup (\Theta_2 \sqcup \Theta_3) \subseteq (\Theta_1 \cup \Theta_2) \sqcup \Theta_3$ follows by 6. because $\Theta_1$ and $\Theta_2$ are included in
$\Theta_1 \cup \Theta_2$, and $(\Theta_1 \cup \Theta_2) \sqcup \Theta_3 \subseteq (\Theta_1 \sqcup \Theta_3) \cup (\Theta_2 \sqcup \Theta_3)$ because for any $\theta_1 \in \Theta_1 \cup \Theta_2$, say
$\theta_1 \in \Theta_1$, and any $\theta_3 \in \Theta_3$, if $\theta_1 \sqcup \theta_3$ exists then it also belongs to $\Theta_1 \sqcup \Theta_3$. □

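Proposition 5. Let $\{\Theta_i\}_{i \in I}$ be a family of sets of partial maps in $[X \rightharpoonup V]$ and let $\mathcal{I} =
\{I_j\}_{j \in J}$ be a partition of $I$: $I = \bigcup\{I_j \mid j \in J\}$ and $I_{j_1} \cap I_{j_2} = \emptyset$ for any different $j_1, j_2 \in J$.
Then $\sqcup\{\Theta_i \mid i \in I\} = \sqcup\{\sqcup\{\Theta_{i_j} \mid i_j \in I_j\} \mid j \in J\}$.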

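Proof. For each $j \in J$, let $Q_j$ denote the set $\sqcup\{\Theta_{i_j} \mid i_j \in I_j\}$. Definition 4 then implies
the following: $Q_j = \{\sqcup\{\theta_{i_j} \mid i_j \in I_j\} \mid \theta_{i_j} \in \Theta_{i_j} \text{ for each } i_j \in I_j, \text{ such that } \sqcup\{\theta_{i_j} \mid i_j \in
I_j\} \text{ exists}\}$. Definition 4 also implies the following: $\sqcup\{Q_j \mid j \in J\} = \{\sqcup\{q_j \mid j \in J\} \mid
q_j \in Q_j \text{ for each } j \in J, \text{ such that } \sqcup\{q_j \mid j \in J\} \text{ exists}\}$. Putting the two equalities above
together, we get that $\sqcup\{\sqcup\{\Theta_{i_j} \mid i_j \in I_j\} \mid j \in J\}$ equals the following:

$\{\, \sqcup\{\sqcup\{\theta_{i_j} \mid i_j \in I_j\} \mid j \in J\} \;\mid\; \theta_{i_j} \in \Theta_{i_j}$ for each $j \in J$ and $i_j \in I_j$, such that
$\sqcup\{\theta_{i_j} \mid i_j \in I_j\}$ exists for each $j \in J$ and
$\sqcup\{\sqcup\{\theta_{i_j} \mid i_j \in I_j\} \mid j \in J\}$ exists $\}$.

Since $\{I_j\}_{j \in J}$ is a partition of $I$, the indexes $i_j$ generated by "for each $j \in J$ and $i_j \in I_j$"
cover precisely all the indexes $i \in I$. Moreover, picking partial functions $\theta_{i_j} \in \Theta_{i_j}$ for each
$j \in J$ and $i_j \in I_j$ is equivalent to picking partial functions $\theta_i \in \Theta_i$ for each $i \in I$, and, in
this case, $\{\theta_i \mid i \in I\} = \bigcup\{\{\theta_{i_j} \mid i_j \in I_j\} \mid j \in J\}$. By 5. in Proposition 3 we then infer that
$\sqcup\{\theta_i \mid i \in I\}$ exists if and only if $\sqcup\{\sqcup\{\theta_{i_j} \mid i_j \in I_j\} \mid j \in J\}$ exists, and if both exist then
$\sqcup\{\theta_i \mid i \in I\} = \sqcup\{\sqcup\{\theta_{i_j} \mid i_j \in I_j\} \mid j \in J\}$; if both exist then $\sqcup\{\theta_{i_j} \mid i_j \in I_j\}$ also exists for
each $j \in J$ (because $\{\theta_{i_j} \mid i_j \in I_j\} \subseteq \{\theta_i \mid i \in I\} = \bigcup\{\{\theta_{i_j} \mid i_j \in I_j\} \mid j \in J\}$). Therefore,
we can conclude that $\sqcup\{\sqcup\{\Theta_{i_j} \mid i_j \in I_j\} \mid j \in J\}$ equals $\{\sqcup\{\theta_i \mid i \in I\} \mid \theta_i \in \Theta_i \text{ for each } i \in
I, \text{ such that } \sqcup\{\theta_i \mid i \in I\} \text{ exists}\}$, which is nothing but $\sqcup\{\Theta_i \mid i \in I\}$. □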

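Corollary 1. The following hold:

(1) $\{\bot\} \sqcup \Theta = \Theta$ (already proved as 5. in Proposition 4);

(2) $\Theta_1 \sqcup \Theta_2 = \Theta_2 \sqcup \Theta_1$;

(3) $\Theta_1 \sqcup (\Theta_2 \sqcup \Theta_3) = (\Theta_1 \sqcup \Theta_2) \sqcup \Theta_3$.

Proof. These follow from Proposition 5 for various index sets $I$ and partitions of it: for 1.
take $I = \{1\}$ and its partition $I = \emptyset \cup I$, take $\Theta_1 = \Theta$, and then use 1. in Proposition 4
saying that $\sqcup\emptyset = \{\bot\}$; for 2. take partitions $\{1\} \cup \{2\}$ and $\{2\} \cup \{1\}$ of $I = \{1, 2\}$, getting
$\Theta_1 \sqcup \Theta_2 = \Theta_2 \sqcup \Theta_1 = \sqcup\{\Theta_i \mid i \in \{1, 2\}\}$; finally, for 3. take partitions $\{1\} \cup \{2, 3\}$ and
$\{1, 2\} \cup \{3\}$ of $I = \{1, 2, 3\}$, getting $\Theta_1 \sqcup (\Theta_2 \sqcup \Theta_3) = (\Theta_1 \sqcup \Theta_2) \sqcup \Theta_3 = \sqcup\{\Theta_i \mid i \in \{1, 2, 3\}\}$. □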

3.3. Least Upper Bound Closures. We next define lub closures of sets of partial maps, 
a crucial operation for the algorithms discussed next in the paper. 

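Definition 5. $\Theta \subseteq [X \rightharpoonup V]$ is lub closed if and only if $\sqcup\Theta' \in \Theta$ for any $\Theta' \subseteq \Theta$ admitting
upper bounds.

Proposition 6. $\{\bot\}$ and $\{\bot, \theta\}$ are lub closed ($\theta \in [X \rightharpoonup V]$).

Proof. It follows easily from Definition 5, using the facts that $\sqcup\emptyset = \bot$ (1. in Proposition 3),
$\sqcup\{\theta\} = \theta$ (2. in Proposition 3), and $\sqcup\{\bot, \theta\} = \theta$ (3. in Proposition 3 for $\Theta = \{\theta\}$). □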
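
For small finite sets, Definition 5 can be checked directly by enumerating all subsets. The sketch below is exponential and meant only as an executable reading of the definition; it assumes the frozenset encoding of partial functions, and `is_lub_closed` is an illustrative name.

```python
from itertools import combinations


def freeze(theta):
    """Hashable form of a partial function given as a dict."""
    return frozenset(theta.items())


def lub(thetas):
    """Lub of a collection of dict-encoded partial functions, or None."""
    result = {}
    for theta in thetas:
        for x, value in theta.items():
            if x in result and result[x] != value:
                return None
            result[x] = value
    return result


def is_lub_closed(thetas):
    """Check that the lub of every subset admitting upper bounds is a member."""
    items = list(thetas)
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            candidate = lub(dict(theta) for theta in subset)
            if candidate is not None and freeze(candidate) not in thetas:
                return False
    return True


BOTTOM = freeze({})
```

Because the empty subset is included in the enumeration and its lub is $\bot$, the empty set is reported as not lub closed, while the two sets of Proposition 6 pass the check.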

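Proposition 7. If $\Theta \subseteq [X \rightharpoonup V]$ and $\{\Theta_i\}_{i \in I}$ is a family of sets of partial functions in
$[X \rightharpoonup V]$, then:

(1) If $\Theta$ is lub closed then $\bot \in \Theta$; in particular, $\emptyset$ is not lub closed;

(2) If $\Theta$ has upper bounds and is lub closed then it has a maximum;

(3) $\Theta$ is lub closed iff $\sqcup\{\Theta \mid i \in I\} = \Theta$ for any $I$;

(4) If $\Theta$ is lub closed and $\Theta_i \subseteq \Theta$ for each $i \in I$ then $\sqcup\{\Theta_i \mid i \in I\} \subseteq \Theta$;

(5) If $\Theta_i$ is lub closed for each $i \in I$ then $\sqcup\{\Theta_i \mid i \in I\}$ is lub closed and
$\bigcup\{\Theta_i \mid i \in I\} \subseteq \sqcup\{\Theta_i \mid i \in I\}$;

(6) If $I$ is finite and $\Theta_i$ is finite for all $i \in I$, then $\sqcup\{\Theta_i \mid i \in I\}$ is finite;

(7) If $\Theta_i$ is lub closed for all $i \in I$ then $\bigcap\{\Theta_i \mid i \in I\}$ is lub closed;

(8) $\bigcap\{\Theta' \mid \Theta' \subseteq [X \rightharpoonup V] \text{ with } \Theta \subseteq \Theta' \text{ and } \Theta' \text{ is lub closed}\}$ is the smallest lub closed set
including $\Theta$.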

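Proof. 1. follows taking $\Theta' = \emptyset$ in Definition 5 and using $\sqcup\emptyset = \bot$ (1. in Proposition 3).

2. follows taking $\Theta' = \Theta$ in Definition 5: $\sqcup\Theta \in \Theta$, so $\max\Theta$ exists (and equals $\sqcup\Theta$).

3. Definition 4 implies that $\sqcup\{\Theta \mid i \in I\}$ equals $\{\sqcup\{\theta_i \mid i \in I\} \mid \theta_i \in \Theta \text{ for each } i \in
I, \text{ such that } \sqcup\{\theta_i \mid i \in I\} \text{ exists}\}$, which is nothing but $\{\sqcup\Theta' \mid \Theta' \subseteq \Theta \text{ such that } \sqcup\Theta' \text{ exists}\}$;
the latter can now be shown equal to $\Theta$ by double inclusion: $\{\sqcup\Theta' \mid \Theta' \subseteq
\Theta \text{ such that } \sqcup\Theta' \text{ exists}\} \subseteq \Theta$ because $\Theta$ is lub closed, and $\Theta \subseteq \{\sqcup\Theta' \mid \Theta' \subseteq \Theta \text{ such that } \sqcup\Theta' \text{ exists}\}$
because one can pick $\Theta' = \{\theta\}$ for each $\theta \in \Theta$ and use the fact that $\sqcup\{\theta\} = \theta$ (2.
in Proposition 3).

4. Let $\theta$ be an arbitrary partial function in $\sqcup\{\Theta_i \mid i \in I\}$, that is, $\theta = \sqcup\{\theta_i \mid i \in I\}$ for
some $\theta_i \in \Theta_i$, one for each $i \in I$, such that $\{\theta_i \mid i \in I\}$ has upper bounds. Since $\Theta$ is lub
closed and $\Theta_i \subseteq \Theta$ for each $i \in I$, it follows that $\theta \in \Theta$. Therefore, $\sqcup\{\Theta_i \mid i \in I\} \subseteq \Theta$.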

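5. Let $\Theta'$ be a set of partial functions included in $\sqcup\{\Theta_i \mid i \in I\}$ which admits an
upper bound; moreover, for each $\theta' \in \Theta'$, let us fix a set $\{\theta_i^{\theta'} \mid i \in I\}$ such that $\theta_i^{\theta'} \in \Theta_i$
for each $i \in I$ and $\theta' = \sqcup\{\theta_i^{\theta'} \mid i \in I\}$ (such sets exist because $\theta' \in \Theta' \subseteq \sqcup\{\Theta_i \mid i \in I\}$).
Let $\Theta^{\theta'}$ be the set $\{\theta_i^{\theta'} \mid i \in I\}$ for each $\theta' \in \Theta'$, let $\Theta^{i}$ be the set $\{\theta_i^{\theta'} \mid \theta' \in \Theta'\}$
for each $i \in I$, and let $\Theta$ be the set $\{\theta_i^{\theta'} \mid \theta' \in \Theta', i \in I\}$. It is easy to see that
$\Theta = \bigcup\{\Theta^{\theta'} \mid \theta' \in \Theta'\} = \bigcup\{\Theta^{i} \mid i \in I\}$ and that $\Theta^{i} \subseteq \Theta_i$ for each $i \in I$. Since $\sqcup\Theta'$ exists
(because $\Theta'$ has upper bounds) and $\sqcup\Theta' = \sqcup\{\theta' \mid \theta' \in \Theta'\} = \sqcup\{\sqcup\Theta^{\theta'} \mid \theta' \in \Theta'\}$, by 5.
in Proposition 3 it follows that $\sqcup\Theta$ exists and $\sqcup\Theta' = \sqcup\Theta$. Since $\Theta = \bigcup\{\Theta^{i} \mid i \in I\}$ and
$\sqcup\Theta$ exists, by 5. in Proposition 3 again we get that $\sqcup\{\sqcup\Theta^{i} \mid i \in I\}$ exists and is equal to
$\sqcup\Theta$, which is equal to $\sqcup\Theta'$. Since $\Theta^{i} \subseteq \Theta_i$ and $\Theta_i$ is lub closed, we get that $\sqcup\Theta^{i} \in \Theta_i$.
That means that $\sqcup\{\sqcup\Theta^{i} \mid i \in I\} \in \sqcup\{\Theta_i \mid i \in I\}$, that is, that $\sqcup\Theta' \in \sqcup\{\Theta_i \mid i \in I\}$.
Since $\Theta' \subseteq \sqcup\{\Theta_i \mid i \in I\}$ was chosen arbitrarily, we conclude that $\sqcup\{\Theta_i \mid i \in I\}$ is lub
closed. To show that $\bigcup\{\Theta_i \mid i \in I\} \subseteq \sqcup\{\Theta_i \mid i \in I\}$, let us pick an $i \in I$ and let us
partition $I$ as $\{i\} \cup (I \setminus \{i\})$. By Proposition 5, $\sqcup\{\Theta_j \mid j \in I\} = \Theta_i \sqcup (\sqcup\{\Theta_j \mid j \in I \setminus \{i\}\})$.
The proof above also implies that $\sqcup\{\Theta_j \mid j \in I \setminus \{i\}\}$ is lub closed, so by 1. we get that
$\bot \in \sqcup\{\Theta_j \mid j \in I \setminus \{i\}\}$. Finally, 6. in Proposition 4 implies $\Theta_i \subseteq \Theta_i \sqcup (\sqcup\{\Theta_j \mid j \in I \setminus \{i\}\})$,
so $\Theta_i \subseteq \sqcup\{\Theta_i \mid i \in I\}$ for each $i \in I$, that is, $\bigcup\{\Theta_i \mid i \in I\} \subseteq \sqcup\{\Theta_i \mid i \in I\}$.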

6. Recall from Definition 4 that $\sqcup\{\Theta_i \mid i \in I\}$ contains the existing least upper bounds
of sets of partial functions containing precisely one partial function in each $\Theta_i$. If $I$ and
each of the $\Theta_i$ for $i \in I$ is finite, then $|\sqcup\{\Theta_i \mid i \in I\}| \leq \prod_{i \in I} |\Theta_i|$ because there are
at most $\prod_{i \in I} |\Theta_i|$ combinations of partial functions, one in each $\Theta_i$, that admit an upper
bound. Therefore, $\sqcup\{\Theta_i \mid i \in I\}$ is also finite.

7. Let $\Theta' \subseteq \bigcap\{\Theta_i \mid i \in I\}$ be a set of partial functions admitting an upper bound. Then
$\Theta' \subseteq \Theta_i$ for each $i \in I$ and, since each $\Theta_i$ is lub closed, $\sqcup\Theta' \in \Theta_i$ for each $i \in I$. Therefore,
$\sqcup\Theta' \in \bigcap\{\Theta_i \mid i \in I\}$.

8. Anticipating the definition of and notation for lub closures (Definition 7), we let $\overline{\Theta}$
denote the set $\bigcap\{\Theta' \mid \Theta' \subseteq [X \rightharpoonup V] \text{ with } \Theta \subseteq \Theta' \text{ and } \Theta' \text{ is lub closed}\}$. It is clear that
$\Theta \subseteq \overline{\Theta}$ and, by 7., that $\overline{\Theta}$ is lub closed. It is also the smallest lub closed set including $\Theta$,
because all such sets $\Theta'$ are among those whose intersection defines $\overline{\Theta}$. □

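Definition 6. Given $\theta' \in [X \rightharpoonup V]$ and $\Theta \subseteq [X \rightharpoonup V]$, let

$(\theta']_\Theta \stackrel{\mathrm{def}}{=} \{\theta \mid \theta \in \Theta \text{ and } \theta \sqsubseteq \theta'\}$

be the set of partial functions in $\Theta$ that are less informative than $\theta'$.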

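Proposition 8. If $\theta, \theta', \theta'', \theta_1, \theta_2 \in [X \rightharpoonup V]$ and if $\Theta \subseteq [X \rightharpoonup V]$ is lub closed, then:

(1) $(\theta']_\Theta$ is lub closed and $\max(\theta']_\Theta$ exists;

(2) If $\theta' \in \{\theta\} \sqcup \Theta$ then $\{\theta'' \mid \theta'' \in \Theta \text{ and } \theta' = \theta \sqcup \theta''\}$ has a maximum and that equals
$\max(\theta']_\Theta$;

(3) If $\theta_1, \theta_2 \in \{\theta\} \sqcup \Theta$ are such that $\theta_1 = \max(\theta_2]_\Theta$, then $\theta_1 = \theta_2$.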
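Proof. 1. First, note that $\theta'$ is an upper bound for $(\theta']_\Theta$ as well as for any subset $\Theta'$ of it, so
any $\Theta' \subseteq (\theta']_\Theta$ has upper bounds, so by 2. in Proposition 2, $\sqcup\Theta'$ exists for any $\Theta' \subseteq (\theta']_\Theta$.
Moreover, if $\Theta' \subseteq (\theta']_\Theta$ then $\sqcup\Theta' \sqsubseteq \theta'$, and since $\Theta$ is lub closed it follows that $\sqcup\Theta' \in \Theta$, so
$\sqcup\Theta' \in (\theta']_\Theta$. Therefore, $(\theta']_\Theta$ is lub closed. 2. in Proposition 7 now implies that $(\theta']_\Theta$ has a
maximum; to be concrete, $\max(\theta']_\Theta$ is nothing but $\sqcup(\theta']_\Theta$, which belongs to $(\theta']_\Theta$ (because
one can pick $\Theta' = (\theta']_\Theta$ above).

2. Let $Q$ be the set of partial functions $\{\theta'' \mid \theta'' \in \Theta \text{ and } \theta' = \theta \sqcup \theta''\}$. Note that $Q$
is non-empty (because $\theta' \in \{\theta\} \sqcup \Theta$, so there is some $\theta'' \in \Theta$ such that $\theta' = \theta \sqcup \theta''$) and has
upper bounds (because $\theta'$ is an upper bound for it), but that it is not necessarily lub closed
(because, unless $\theta' = \theta$, $Q$ does not contain $\bot$, contradicting 1. in Proposition 7). Hence
$Q$ has a lub (by 2. in Proposition 2), say $q$, and $q = \sqcup Q \sqsubseteq \theta'$; since $\theta \sqsubseteq \theta'$, it follows that
$\theta \sqcup q \sqsubseteq \theta'$. On the other hand $\theta' \sqsubseteq \theta \sqcup q$ by 4. in Proposition 3, because there is some
$\theta'' \in Q$ such that $\theta' = \theta \sqcup \theta''$ and $\theta'' \sqsubseteq q$. Therefore, $\theta' = \theta \sqcup q$. Since $\Theta$ is lub closed, it
follows that $q \in \Theta$. Therefore, $q \in Q$, so $q$ is the maximum element of $Q$. Let us next show
that $q = \max(\theta']_\Theta$. The relation $q \sqsubseteq \max(\theta']_\Theta$ is immediate because $q \in (\theta']_\Theta$ (we proved
above that $q \in \Theta$ and $q \sqsubseteq \theta'$). For $\max(\theta']_\Theta \sqsubseteq q$ it suffices to show that $\max(\theta']_\Theta \in Q$, that
is, that $\theta \sqcup \max(\theta']_\Theta = \theta'$: $\theta \sqcup \max(\theta']_\Theta \sqsubseteq \theta'$ follows because $\theta \sqsubseteq \theta'$ and $\max(\theta']_\Theta \sqsubseteq \theta'$,
while $\theta' \sqsubseteq \theta \sqcup \max(\theta']_\Theta$ follows because there is some $\theta'' \in \Theta$ such that $\theta' = \theta \sqcup \theta''$ and,
since $\theta'' \sqsubseteq \max(\theta']_\Theta$, $\theta \sqcup \theta'' \sqsubseteq \theta \sqcup \max(\theta']_\Theta$ (by 4. in Proposition 3).

3. admits a direct proof simpler than that of 2.; however, since 2. is needed anyway,
we prefer to use 2. Note that $\theta \sqsubseteq \theta_1 \sqsubseteq \theta_2$. By 2., $\theta_1 = \max\{\theta'' \mid \theta'' \in \Theta \text{ and } \theta_2 = \theta \sqcup \theta''\}$,
which implies $\theta_2 = \theta \sqcup \theta_1 = \theta_1$. □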

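Definition 7. Given $\Theta \subseteq [X \rightharpoonup V]$, we let $\overline{\Theta}$, the least upper bound (lub) closure of
$\Theta$, be defined as follows:

$\overline{\Theta} \stackrel{\mathrm{def}}{=} \bigcap\{\Theta' \mid \Theta' \subseteq [X \rightharpoonup V] \text{ with } \Theta \subseteq \Theta' \text{ and } \Theta' \text{ is lub closed}\}$.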

Proposition 9. The following hold ($\Theta \subseteq [X \rightharpoonup V]$, $\theta \in [X \rightharpoonup V]$):

(1) $\overline{\Theta}$ is the smallest lub closed set including $\Theta$;

(2) $\overline{\emptyset} = \overline{\{\bot\}} = \{\bot\}$;

(3) $\overline{\{\theta\}} = \{\bot, \theta\}$.

Proof. 1. follows by 8. in Proposition 7. For 2. and 3., first note that $\{\bot\}$ and $\{\bot, \theta\}$ are
lub closed by Proposition 6; second, note that they are indeed the smallest lub closed sets
including $\bot$ and, respectively, $\theta$, as any lub closed set must include $\bot$ (1. in Proposition 7). □

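Proposition 10. The lub closure map $\overline{\,\cdot\,} : 2^{[X \rightharpoonup V]} \to 2^{[X \rightharpoonup V]}$ is a closure operator, that is,
for any $\Theta, \Theta_1, \Theta_2 \subseteq [X \rightharpoonup V]$,

(1) (extensivity) $\Theta \subseteq \overline{\Theta}$;

(2) (monotonicity) If $\Theta_1 \subseteq \Theta_2$ then $\overline{\Theta_1} \subseteq \overline{\Theta_2}$;

(3) (idempotency) $\overline{\overline{\Theta}} = \overline{\Theta}$.

Proof. Extensivity and idempotency follow immediately from the definitions of $\overline{\Theta}$ and $\overline{\overline{\Theta}}$
(which are lub closed by 1. in Proposition 9). For monotonicity, one should note that $\overline{\Theta_2}$
satisfies the properties of $\overline{\Theta_1}$ (i.e., $\Theta_1 \subseteq \overline{\Theta_2}$ and $\overline{\Theta_2}$ is lub closed); since $\overline{\Theta_1}$ is the smallest
with those properties, it follows that $\overline{\Theta_1} \subseteq \overline{\Theta_2}$. □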
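Proposition 11. $\overline{\bigcup\{\Theta_i \mid i \in I\}} = \sqcup\{\overline{\Theta_i} \mid i \in I\}$ for any family $\{\Theta_i\}_{i \in I}$ of sets of partial functions
in $[X \rightharpoonup V]$.

Proof. Since $\overline{\Theta_i}$ is lub closed for any $i \in I$, 5. in Proposition 7 implies that $\sqcup\{\overline{\Theta_i} \mid i \in I\}$ is
lub closed and $\bigcup\{\overline{\Theta_i} \mid i \in I\} \subseteq \sqcup\{\overline{\Theta_i} \mid i \in I\}$. Since 1. in Proposition 10 implies $\Theta_i \subseteq \overline{\Theta_i}$
for each $i \in I$ and since $\overline{\bigcup\{\Theta_i \mid i \in I\}}$ is the smallest lub closed set including $\bigcup\{\Theta_i \mid i \in I\}$
(1. in Proposition 9), the inclusion $\overline{\bigcup\{\Theta_i \mid i \in I\}} \subseteq \sqcup\{\overline{\Theta_i} \mid i \in I\}$ holds. Conversely,
2. in Proposition 10 implies that $\overline{\Theta_i} \subseteq \overline{\bigcup\{\Theta_i \mid i \in I\}}$ for any $i \in I$, so $\sqcup\{\overline{\Theta_i} \mid i \in I\} \subseteq
\overline{\bigcup\{\Theta_i \mid i \in I\}}$ holds by 4. in Proposition 7. □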

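Corollary 2. For any $\theta \in [X \rightharpoonup V]$ and $\Theta \subseteq [X \rightharpoonup V]$, equality $\overline{\{\theta\} \cup \Theta} = \{\bot, \theta\} \sqcup \overline{\Theta}$ holds.

Proof. $\overline{\{\theta\} \cup \Theta} = \overline{\{\theta\}} \sqcup \overline{\Theta}$ by Proposition 11 and $\overline{\{\theta\}} = \{\bot, \theta\}$ by 3. in Proposition 9. □

Corollary 3. If $\Theta \subseteq [X \rightharpoonup V]$ is finite then $\overline{\Theta}$ is also finite.

Proof. Suppose that $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ for some $n \geq 0$. Iteratively applying Corollary 2,
$\overline{\Theta} = \{\bot, \theta_1\} \sqcup \{\bot, \theta_2\} \sqcup \cdots \sqcup \{\bot, \theta_n\}$; in obtaining that, we used 2. in Proposition 9 and 1.
in Corollary 1. The result follows now by 6. in Proposition 7. □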
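
The proof of Corollary 3 suggests a direct way to compute the lub closure of a finite set: start from the singleton containing $\bot$ and fold in each element via Corollary 2. The sketch below assumes the frozenset encoding of partial functions; `pair_lub` specializes the lub of a family of sets to two sets, and all names are illustrative.

```python
def freeze(theta):
    """Hashable form of a partial function given as a dict."""
    return frozenset(theta.items())


BOTTOM = freeze({})


def lub2(theta1, theta2):
    """Lub of two frozen partial functions, or None if they are incompatible."""
    merged = dict(theta1)
    for x, value in theta2:
        if x in merged and merged[x] != value:
            return None
        merged[x] = value
    return freeze(merged)


def pair_lub(thetas1, thetas2):
    """Lub of two sets of partial functions: all existing pairwise lubs."""
    result = set()
    for theta1 in thetas1:
        for theta2 in thetas2:
            candidate = lub2(theta1, theta2)
            if candidate is not None:
                result.add(candidate)
    return result


def lub_closure(thetas):
    """Closure of a finite set, folding {bottom, theta} into the result."""
    result = {BOTTOM}                       # closure of the empty set
    for theta in thetas:
        result = pair_lub({BOTTOM, theta}, result)
    return result
```

Each step can at most double the size of the intermediate result, which matches the finiteness bound used in the proof.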



4. Parametric Traces and Properties 

Here we introduce the notions of event, trace and property, first non-parametric and then 
parametric. Traces are sequences of events. Parametric events can carry data-values as 
instances of parameters. Parametric traces are traces over parametric events. Properties are 
trace classifiers, that is, mappings partitioning the space of traces into categories (violating 
traces, validating traces, don't know traces, etc.). Parametric properties are parametric 
trace classifiers and provide, for each parameter instance, the category to which the trace 
slice corresponding to that parameter instance belongs. Trace slicing is defined as a reduct 
operation that forgets all the events that are unrelated to the given parameter instance. 

4.1. The Non-Parametric Case. We next introduce non-parametric events, traces and 
properties, which will serve as an infrastructure for their parametric variants in Section 4.2.

Definition 8. Let $\mathcal{E}$ be a set of (non-parametric) events, called base events or simply
events. An $\mathcal{E}$-trace, or simply a (non-parametric) trace when $\mathcal{E}$ is understood or not
important, is any finite sequence of events in $\mathcal{E}$, that is, an element in $\mathcal{E}^*$. If event $e \in \mathcal{E}$
appears in trace $w \in \mathcal{E}^*$ then we write $e \in w$.

Our parametric trace slicing and monitoring techniques in Sections 6 and 7 can be
easily adapted to also work with infinite traces. Since infinite versus finite traces is not an 
important aspect of the work reported here, we keep the presentation free of unnecessary 
technical complications and consider only finite traces. 

Example. (part 1 of simple running example) As in Section 2.1, consider a certain
resource (e.g., a synchronization object) that can be acquired and released during the
lifetime of a given procedure (between its begin and end). Then the set of events $\mathcal{E}$ is
{acquire, release, begin, end} and execution traces corresponding to this resource are sequences
such as begin acquire acquire release end begin end, begin acquire acquire, begin acquire release acquire end,
etc. For the time being, there are no "good" or "bad" execution traces. □

There is a plethora of formalisms to specify trace requirements. Many of these result in
specifying at least two types of traces: those validating the specification (i.e., correct traces),
and those violating the specification (i.e., incorrect traces).

Example, (part 2) Consider a regular expression specification, 

(begin ($\epsilon$ | (acquire (acquire | release)* release)) end)*

stating that the procedure can (non-recursively) take place multiple times and, if the re- 
source is acquired during the procedure then it is released by the end of the procedure. 
Assume that the resource can be acquired and released multiple times, with the effect of ac- 
quiring and respectively releasing it precisely once. The validating (or matching) traces for 
this property are those satisfying the pattern, e.g., begin acquire acquire release end begin end. 
At first sight, one may say that all the other traces are violating (or failing) traces, because 
they are not in the language of the regular expression. However, there are two interesting 
types of such "failing" traces: ones which may still lead to a matching trace provided the 
right events will be received in the future, e.g., begin acquire acquire, and ones which have no 
chance of becoming a matching trace, e.g., begin acquire release acquire end. □ 

In general, traces are not enforced to correspond to terminated programs (this is par- 
ticularly useful in monitoring); if one wants to enforce traces to correspond to terminated 
programs, then one can have the system generate a special end-of-trace event and have the 
property specification require that event at the end of each trace. 

Therefore, a trace property may partition the space of traces into more than two cate- 
gories. For some specification formalisms, for example ones based on fuzzy logics or multiple 
truth values, the set of traces may be split into more than three categories, even into a
continuous space of categories. Section 2.6 showed an example of a property where the space
of trace categories was the set of rational numbers between 0 and 1.

Definition 9. An $\mathcal{E}$-property $P$, or simply a (base or non-parametric) property, is a
function $P : \mathcal{E}^* \to \mathcal{C}$ partitioning the set of traces into (verdict) categories $\mathcal{C}$.

It is common, though not enforced, that the set of property verdict categories C in Def- 
inition [9] includes validating (or similar), violating (or similar), and don't know (or ?) categories. 
In general, C can be any set, finite or infinite. 

We believe that the definition of non-parametric trace property above is general enough 
that it can easily accommodate any particular specification formalism, such as ones based 
on linear temporal logics, regular expressions, context-free grammars, etc. All one needs to 
do in order to instantiate the general results in this paper for a particular specification for- 
malism is to decide upon the desired categories in which traces are intended to be classified, 
and then define the property associated to a specification accordingly. 

For example, if the specification formalism of choice is that of regular expressions over
$\mathcal{E}$ and one is interested in classifying traces in three categories as in our example above,
then one can pick $\mathcal{C}$ to be the set {match, fail, don't know} and, for a given regular expression
$E$, define its associated property $P_E : \mathcal{E}^* \to \mathcal{C}$ as follows: $P_E(w) =$ match iff $w$ is in the
language of $E$, $P_E(w) =$ fail iff there is no $w' \in \mathcal{E}^*$ such that $w\,w'$ is in the language of $E$,
and $P_E(w) =$ don't know otherwise; this is the monitoring semantics of regular expressions in
the JavaMOP and RV systems [281 [Til [27]. 
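
The three-category property $P_E$ described above can be sketched in Python for the regular expression of the running example. This is only an illustration under stated assumptions: the automaton `DELTA`, its state numbering, and the name `classify` are hand-derived for this sketch and are not taken from JavaMOP or RV.

```python
# Hand-derived DFA (an assumption of this sketch) for the running example:
#   (begin (epsilon | acquire (acquire | release)* release) end)*
# States: 0 = outside the procedure (accepting), 1 = just after begin,
#         2 = inside, last event was acquire, 3 = inside, last event was release.
DELTA = {
    (0, "begin"): 1,
    (1, "end"): 0, (1, "acquire"): 2,
    (2, "acquire"): 2, (2, "release"): 3,
    (3, "acquire"): 2, (3, "release"): 3, (3, "end"): 0,
}
ACCEPTING = {0}


def classify(trace):
    """Return 'match', 'fail' or "don't know" for a list of base events."""
    state = 0
    for event in trace:
        state = DELTA.get((state, event))
        if state is None:
            # No transition: no extension of the trace can ever match.
            return "fail"
    # Every state of this DFA can still reach state 0, so a trace that has
    # not failed is either matching or may still become matching.
    return "match" if state in ACCEPTING else "don't know"
```

With this sketch, the trace begin acquire acquire release end begin end is classified as match, begin acquire acquire as don't know, and begin acquire release acquire end as fail, in agreement with the second part of the running example.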
Other semantic choices are possible even for the simple case of regular expressions;
for example, one may choose $\mathcal{C}$ to be the set {match, don't care} and define $P_E(w) =$ match
iff $w$ is in the language of $E$, and $P_E(w) =$ don't care otherwise; this is the semantics of
regular expressions in Tracematches [1], where, depending upon how one writes the regular
expression, matching can mean either a violation or a validation of the desired property.

Similarly, one can have various verdict categories for linear temporal logic (LTL). For 
example, one can report violation when a bad prefix is encountered, can report validation when 
a good prefix is encountered, and can report inconclusive when neither of the above; this is the 
current monitoring semantics of LTL in JavaMOP and RV [28, 27]. Semantic
and algorithmic aspects regarding LTL monitoring can be found, e.g., in [2 H W2 \ [33 } [7]. 

In some applications, one may not be interested in certain categories of traces, such 
as in those classified as don't know, don't care, or inconclusive; if that is the case, then those 
applications can simply ignore these, like Tracematches, JavaMOP and RV do. It may be 
worth making it explicit that in this paper we do not attempt to propose or promote any 
particular formalism for specifying properties about execution traces. Instead, our approach 
is to define properties as generally as possible to capture the various specification formalisms 
that we are aware of as special cases, and then to develop our subsequent techniques to work 
with such general properties. We believe that our definition of property in Definition [9] is 
general enough to allow us to claim that our results are specification-formalism-independent.

An additional benefit of defining properties so generally, as mappings from traces to 
categories, is that parametric properties, in spite of their much more general flavor, are also
properties (but, obviously, over different traces and over different categories). 

4.2. The Parametric Case. Events often carry concrete data instantiating parameters. 

Example. (part 3) In our running example, events acquire and release are parametric in the
resource being acquired or released; if $r$ is the name of the generic resource parameter and
$r_1$ and $r_2$ are two concrete resources, then parametric acquire/release events have the form
acquire$\langle r \mapsto r_1\rangle$, release$\langle r \mapsto r_2\rangle$, etc. Not all events need carry instances for all parameters;
e.g., the begin/end parametric events have the form begin$\langle\bot\rangle$ and end$\langle\bot\rangle$, where $\bot$, the
partial map undefined everywhere, instantiates no parameter. □

Recall from Definition 1 that $[A \to B]$ and $[A \rightharpoonup B]$ denote the sets of total and,
respectively, partial functions from $A$ to $B$.

Definition 10. (Parametric events and traces). Let $X$ be a set of parameters and
let $V$ be a set of corresponding parameter values. If $\mathcal{E}$ is a set of base events like
in Definition 8, then let $\mathcal{E}\langle X\rangle$ denote the set of corresponding parametric events $e\langle\theta\rangle$,
where $e$ is a base event in $\mathcal{E}$ and $\theta$ is a partial function in $[X \rightharpoonup V]$. A parametric trace
is a trace with events in $\mathcal{E}\langle X\rangle$, that is, a word in $\mathcal{E}\langle X\rangle^*$.

Therefore, a parametric event is an event carrying values for zero, one, several or even 
all the parameters, and a parametric trace is a finite sequence of parametric events. In 
practice, the number of values carried by an event is finite; however, we do not need to 
enforce this restriction in our theoretical developments. Also, in practice the parameters 
may be typed, in which case the set of their corresponding values is given by their type. To 
simplify writing, we occasionally assume the set of parameter values V implicit. 

Example. (part 4) A parametric trace for our running example can be the following:
begin$\langle\bot\rangle$ acquire$\langle\theta_1\rangle$ acquire$\langle\theta_2\rangle$ acquire$\langle\theta_1\rangle$ release$\langle\theta_1\rangle$ end$\langle\bot\rangle$ begin$\langle\bot\rangle$ acquire$\langle\theta_2\rangle$ release$\langle\theta_2\rangle$ end$\langle\bot\rangle$,
where $\theta_1$ maps $r$ to $r_1$ and $\theta_2$ maps $r$ to $r_2$. To simplify writing, we take the freedom to
only list the parameter instance values when writing parameter instances, that is, $\langle r_1\rangle$
instead of $\langle r \mapsto r_1\rangle$, or $\tau{\upharpoonright}_{r_2}$ instead of $\tau{\upharpoonright}_{r \mapsto r_2}$, etc. This notation is formally
introduced in the next section, as Notation 1. With this notation, the above trace becomes
begin$\langle\rangle$ acquire$\langle r_1\rangle$ acquire$\langle r_2\rangle$ acquire$\langle r_1\rangle$ release$\langle r_1\rangle$ end$\langle\rangle$ begin$\langle\rangle$ acquire$\langle r_2\rangle$ release$\langle r_2\rangle$ end$\langle\rangle$.
This trace involves two resources, $r_1$ and $r_2$, and it really consists of two trace slices, one
for each resource, merged together. The begin and end events belong to both trace slices.
The slice corresponding to $\theta_1$ is begin acquire acquire release end begin end, while the one for
$\theta_2$ is begin acquire end begin acquire release end. □

Recall from Definition 1 that two partial maps of same source and target are compatible
when they do not disagree on any of the elements on which they are both defined, and that
one is less informative than another, written $\theta \sqsubseteq \theta'$, when they are compatible and the
domain of the former is included in the domain of the latter. With the notation in the
example above we have, for our running example, that $\langle\rangle$ is compatible with $\langle r_1\rangle$ and with
$\langle r_2\rangle$, but $\langle r_1\rangle$ and $\langle r_2\rangle$ are not compatible; moreover, $\langle\rangle \sqsubseteq \langle r_1\rangle$ and $\langle\rangle \sqsubseteq \langle r_2\rangle$.

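Definition 11. (Trace slicing) Given parametric trace $\tau \in \mathcal{E}\langle X\rangle^*$ and partial function $\theta$
in $[X \rightharpoonup V]$, we let the $\theta$-trace slice $\tau{\upharpoonright}_\theta \in \mathcal{E}^*$ be the non-parametric trace in $\mathcal{E}^*$ defined as:
• $\epsilon{\upharpoonright}_\theta = \epsilon$, where $\epsilon$ is the empty trace/word, and
• $(\tau\, e\langle\theta'\rangle){\upharpoonright}_\theta = (\tau{\upharpoonright}_\theta)\, e$ when $\theta' \sqsubseteq \theta$, and $(\tau\, e\langle\theta'\rangle){\upharpoonright}_\theta = \tau{\upharpoonright}_\theta$ when $\theta' \not\sqsubseteq \theta$.
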

Therefore, the trace slice $\tau{\upharpoonright}_\theta$ first filters out all the parametric events that are not relevant
for the instance $\theta$, i.e., which contain instances of parameters that $\theta$ does not care about,
and then, for the remaining events relevant to $\theta$, it forgets the parameters so that the trace
can be checked against base, non-parametric properties. It is crucial to discard parameter
instances that are not relevant to $\theta$ during the slicing, including those more informative
than $\theta$, in order to achieve a correct slice for $\theta$: in our running example, the trace slice for
$\langle\rangle$ should contain only begin and end events and no acquire or release. Otherwise, the acquire
and release of different resources will interfere with each other in the trace slice for $\langle\rangle$.
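
Trace slicing as described above can be illustrated with a short Python sketch. The encoding is an assumption made only for this illustration: a parameter instance is a `frozenset` of (parameter, value) pairs, so that the less-informative relation coincides with set inclusion, and a parametric event is an (event name, instance) pair. The names `inst`, `slice_trace` and `TRACE` are hypothetical.

```python
BOT = frozenset()  # the partial function undefined everywhere


def inst(**bindings):
    """Build a parameter instance, e.g. inst(r="r1") for the map r -> r1."""
    return frozenset(bindings.items())


def less_informative(theta1, theta2):
    # With instances encoded as sets of (parameter, value) pairs, theta1 is
    # less informative than theta2 exactly when it is a subset of theta2.
    return theta1 <= theta2


def slice_trace(trace, theta):
    """Keep the events whose instance is less informative than theta,
    then forget their parameters."""
    return [e for e, theta_e in trace if less_informative(theta_e, theta)]


R1, R2 = inst(r="r1"), inst(r="r2")
# The parametric trace of part 4 of the running example.
TRACE = [("begin", BOT), ("acquire", R1), ("acquire", R2), ("acquire", R1),
         ("release", R1), ("end", BOT), ("begin", BOT), ("acquire", R2),
         ("release", R2), ("end", BOT)]
```

Under this encoding, `slice_trace(TRACE, R1)` and `slice_trace(TRACE, R2)` give the two slices listed in part 4 of the running example, while `slice_trace(TRACE, BOT)` keeps only the begin and end events.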

One should not confuse extracting/abstracting traces from executions with slicing traces. 
The former determines the events to include in the trace, as well as parameter instances 
carried by events, while the latter dispatches each event in the given trace to correspond- 
ing trace slices according to the event's parameter instance. Different abstractions may 
result in different parametric traces from the same execution and thus may lead to different 
trace slices for the same parameter instance $\theta$. For the (map, collection, iterator) example
in Section […], $X = \{m, c, i\}$ and an execution may generate the following parametric trace:
createColl$\langle m_1, c_1\rangle$ createIter$\langle c_1, i_1\rangle$ next$\langle i_1\rangle$ updateMap$\langle m_1\rangle$. The trace slice for $\langle m_1\rangle$ is updateMap
for this parametric trace. Now suppose that we are only interested in operations on maps.
Then $X = \{m\}$ and the trace abstracted from the execution generating the above trace is
createColl$\langle m_1\rangle$ updateMap$\langle m_1\rangle$, in which events and parameter bindings irrelevant to $m$ are
removed. Then the trace slice for $\langle m_1\rangle$ is createColl updateMap. In this paper we focus only
on trace slicing; more discussion about trace abstraction can be found in [15].

Specifying properties over parametric traces is rather challenging, because one may 
want to specify a property for one generic parameter instance and then say "and so on for 
all the other instances". In other words, one may want to specify a sort of a universally 
quantified property over base events, but, unfortunately, the underlying specification for- 
malism may not allow such quantification over data; for example, none of the conventional 
formalisms to specify properties on linear traces listed above (i.e., linear temporal logics,
regular expressions, context-free grammars) or mentioned in the rest of the paper has data
quantification by default. We say "by default" because, in some cases, there are attempts
to add data quantification; for example, [23] investigates the implications and the necessary
restrictions resulting from adding quantifiers to LTL, and [33] investigates a finite-trace
variant of parametric LTL together with a translation to parametric alternating automata.

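Definition 12. Let $X$ be a set of parameters together with their corresponding parameter
values $V$, like in Definition 10, and let $P : \mathcal{E}^* \to \mathcal{C}$ be a non-parametric property like in
Definition 9. Then we define the parametric property $\Lambda X . P$ as the property (over traces
$\mathcal{E}\langle X\rangle^*$ and categories $[[X \rightharpoonup V] \to \mathcal{C}]$)

$\Lambda X . P : \mathcal{E}\langle X\rangle^* \to [[X \rightharpoonup V] \to \mathcal{C}]$

defined as

$(\Lambda X . P)(\tau)(\theta) = P(\tau{\upharpoonright}_\theta)$

for any $\tau \in \mathcal{E}\langle X\rangle^*$ and any $\theta \in [X \rightharpoonup V]$. If $X = \{x_1, \ldots, x_n\}$ we may write $\Lambda x_1, \ldots, x_n . P$
instead of $(\Lambda\{x_1, \ldots, x_n\} . P)$. Also, if $P_\varphi$ is defined using a pattern or formula $\varphi$ in some
particular trace specification formalism, we take the liberty to write $\Lambda X . \varphi$ instead of $\Lambda X . P_\varphi$.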

Parametric properties $\Lambda X . P$ over base properties $P : \mathcal{E}^* \to \mathcal{C}$ are therefore properties
taking traces in $\mathcal{E}\langle X\rangle^*$ to categories $[[X \rightharpoonup V] \to \mathcal{C}]$, i.e., to function domains from parameter
instances to base property categories. $\Lambda X . P$ is defined as if many instances of $P$ are
observed at the same time on the parametric trace, one property instance for each parameter
instance, each property instance concerned with its events only, dropping the unrelated ones.

Example, (part 5) Let P be the non-parametric property specified by the regular ex- 
pression in the second part of our running example above (using the mapping of regular 
expressions to properties discussed in the second part of our running example and after Def- 
inition [9]- i.e., the JavaMOP/RV semantic approach to parametric monitoring [H]). Since 
we want P to hold for any resource instance, we define the following parametric property: 

$\Lambda r$ . (begin ($\epsilon$ | (acquire (acquire | release)* release)) end)*.

If $\tau$ is the parametric trace and $\theta_1$ and $\theta_2$ are the parameter instances in the fourth part
of our running example, then the semantics of the parametric property above on trace $\tau$ is
validating for parameter instance $\theta_1$ and violating for parameter instance $\theta_2$. □
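
The construction of a parametric property from a base property can be sketched as a higher-order function in Python. As before, this is an illustration under assumptions: instances are encoded as `frozenset`s of (parameter, value) pairs, the automaton inside `base_property` is hand-derived for the regular expression of the running example, and the names `parametric`, `base_property` and `slice_trace` are hypothetical.

```python
BOT = frozenset()  # the partial function undefined everywhere


def inst(**bindings):
    return frozenset(bindings.items())


def slice_trace(trace, theta):
    # Keep events whose instance is less informative than theta (subset test).
    return [e for e, theta_e in trace if theta_e <= theta]


def base_property(trace):
    """Three-valued verdict for (begin (eps | acquire (acquire|release)* release) end)*,
    using a hand-derived DFA (an assumption of this sketch)."""
    delta = {(0, "begin"): 1, (1, "end"): 0, (1, "acquire"): 2,
             (2, "acquire"): 2, (2, "release"): 3,
             (3, "acquire"): 2, (3, "release"): 3, (3, "end"): 0}
    state = 0
    for event in trace:
        state = delta.get((state, event))
        if state is None:
            return "fail"
    return "match" if state == 0 else "don't know"


def parametric(P):
    """Lift a base property P to traces of parametric events:
    parametric(P)(trace)(theta) = P(slice of trace for theta)."""
    return lambda trace: (lambda theta: P(slice_trace(trace, theta)))


R1, R2 = inst(r="r1"), inst(r="r2")
TRACE = [("begin", BOT), ("acquire", R1), ("acquire", R2), ("acquire", R1),
         ("release", R1), ("end", BOT), ("begin", BOT), ("acquire", R2),
         ("release", R2), ("end", BOT)]
verdict = parametric(base_property)(TRACE)
```

Here `verdict(R1)` is match (validating) and `verdict(R2)` is fail (violating), which agrees with the fifth part of the running example.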

5. Slicing With Less 

Consider a parametric trace $\tau$ in $\mathcal{E}\langle X\rangle^*$ and a parameter instance $\theta$. Since there is no
a priori correlation between the parameters being instantiated by $\theta$ and those by the various
parametric events in $\tau$, it may very well be the case that $\theta$ contains parameter instances
that never appear in $\tau$. In this section we show that slicing $\tau$ by $\theta$ is the same as slicing
it by a "smaller" parameter instance than $\theta$, namely one containing only those parameters
instantiated by $\theta$ that also appear as instances of some parameters in some events in $\tau$.
Formally, this smaller parameter instance is the largest partial map smaller than $\theta$ in the
lub closure of all the parameter instances of events in $\tau$; this partial function is proved to
indeed exist. We first formalize a notation used informally so far in this paper:

Notation 1. When the domain of $\theta$ is finite, which is always the case in our examples
in this paper and probably will also be the case in most practical uses of our trace slicing
algorithm, and when the corresponding parameter names are clear from context, we take
the liberty to write partial functions compactly by simply listing their parameter values;
for example, we write a partial function with $\theta(a) = a_2$, $\theta(b) = b_1$ and $\theta(c) = c_1$ as the
sequence "$a_2 b_1 c_1$". The function $\bot$ then corresponds to the empty sequence.

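Example. Here is a parametric trace with events parametric in $\{a, b, c\}$, where the
parameters take values in the set $\{a_1, a_2, b_1, c_1\}$:

$\tau = e_1\langle a_1\rangle\ e_2\langle a_2\rangle\ e_3\langle b_1\rangle\ e_4\langle a_2 b_1\rangle\ e_5\langle a_1\rangle\ e_6\langle\rangle\ e_7\langle b_1\rangle\ e_8\langle c_1\rangle\ e_9\langle a_2 b_1\rangle\ e_{10}\langle a_1 b_1 c_1\rangle\ e_{11}\langle\rangle$.

It may be the case that some of the base events appearing in a trace are the same; for
example, $e_1$ may be equal to $e_2$ and to $e_5$. It is actually frequently the case in practice (at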
least in PQL [25|, Tracematches [1], JavaMOP [14j and RV [2^) that parametric events 
are specified a priori with a given (sub)set of parameters, so that each event in $\mathcal{E}$ is always
instantiated with partial functions over the same domain, i.e., if $e\langle\theta\rangle$ and $e\langle\theta'\rangle$ appear in a
parametric trace, then Dom($\theta$) = Dom($\theta'$). While this restriction is reasonable and sometimes
useful, our trace slicing and monitoring algorithms in this paper do not need it. □

Recall from Definition 11 that the trace slice $\tau{\upharpoonright}_\theta$ keeps from $\tau$ only those events that
are relevant for $\theta$ and drops their parameters.

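Example. Consider again the sample parametric trace above with events parametric in
$\{a, b, c\}$: $\tau = e_1\langle a_1\rangle\ e_2\langle a_2\rangle\ e_3\langle b_1\rangle\ e_4\langle a_2 b_1\rangle\ e_5\langle a_1\rangle\ e_6\langle\rangle\ e_7\langle b_1\rangle\ e_8\langle c_1\rangle\ e_9\langle a_2 b_1\rangle\ e_{10}\langle a_1 b_1 c_1\rangle\ e_{11}\langle\rangle$.
Several slices of $\tau$ are listed below:

$\tau{\upharpoonright}_{a_1} = e_1 e_5 e_6 e_{11}$
$\tau{\upharpoonright}_{a_2} = e_2 e_6 e_{11}$
$\tau{\upharpoonright}_{a_1 b_1} = e_1 e_3 e_5 e_6 e_7 e_{11}$
$\tau{\upharpoonright}_{a_2 b_1} = e_2 e_3 e_4 e_6 e_7 e_9 e_{11}$
$\tau{\upharpoonright}_{\bot} = e_6 e_{11}$
$\tau{\upharpoonright}_{a_1 b_1 c_1} = e_1 e_3 e_5 e_6 e_7 e_8 e_{10} e_{11}$
$\tau{\upharpoonright}_{a_2 b_1 c_1} = e_2 e_3 e_4 e_6 e_7 e_8 e_9 e_{11}$
$\tau{\upharpoonright}_{a_1 b_2 c_1} = e_1 e_5 e_6 e_8 e_{11}$
$\tau{\upharpoonright}_{b_2 c_2} = e_6 e_{11}$

In order for the partial functions above to make sense, we assumed that the set $V$ in which
parameters $X = \{a, b, c\}$ take values includes $\{a_1, a_2, b_1, c_1\}$. □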

Definition 13. Given parametric trace $\tau \in \mathcal{E}\langle X\rangle^*$, we let $\Theta_\tau$ denote the lub closure of all
the parameter instances appearing in events in $\tau$, that is, $\Theta_\tau = \overline{\{\theta \mid \theta \in [X \rightharpoonup V],\ e\langle\theta\rangle \in \tau\}}$.

Proposition 12. For any parametric trace $\tau \in \mathcal{E}\langle X\rangle^*$, the set $\Theta_\tau$ is a finite lub closed set.

Proof. $\Theta_\tau$ is already defined as a lub closed set; since $\tau$ is finite, Corollary 3 implies that
$\Theta_\tau$ is finite. □

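Proposition 13. Given $\tau\, e\langle\theta\rangle \in \mathcal{E}\langle X\rangle^*$, the following equality holds: $\Theta_{\tau e\langle\theta\rangle} = \{\bot, \theta\} \sqcup \Theta_\tau$.

Proof. It follows by the following sequence of equalities:

$\Theta_{\tau e\langle\theta\rangle} = \overline{\{\theta' \mid \theta' \in [X \rightharpoonup V],\ e'\langle\theta'\rangle \in \tau\, e\langle\theta\rangle\}}$
$= \overline{\{\theta\} \cup \{\theta' \mid \theta' \in [X \rightharpoonup V],\ e'\langle\theta'\rangle \in \tau\}}$
$= \overline{\{\theta\} \cup \Theta_\tau}$
$= \{\bot, \theta\} \sqcup \overline{\Theta_\tau}$
$= \{\bot, \theta\} \sqcup \Theta_\tau$.

The first equality follows by Definition 13, the second by separating the case $e'\langle\theta'\rangle = e\langle\theta\rangle$,
the third again by Definition 13, the fourth by Corollary 2, and the fifth by Proposition 12.
Therefore, $\Theta_{\tau e\langle\theta\rangle}$ is the smallest lub closed set that contains $\theta$ and includes $\Theta_\tau$. □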

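Proposition 14. Given $\tau \in \mathcal{E}\langle X\rangle^*$ and $\theta \in [X \rightharpoonup V]$, the equality $\tau{\upharpoonright}_\theta = \tau{\upharpoonright}_{\max\,(\theta]_{\Theta_\tau}}$ holds.

Proof. We prove the following more general result:

"Let $\Theta \subseteq [X \rightharpoonup V]$ be lub closed and let $\theta \in [X \rightharpoonup V]$;
then $\tau{\upharpoonright}_\theta = \tau{\upharpoonright}_{\max\,(\theta]_\Theta}$ for any $\tau \in \mathcal{E}\langle X\rangle^*$ with $\Theta_\tau \subseteq \Theta$."
First note that the statement above is well-formed because $\max\,(\theta]_\Theta$ exists whenever $\Theta$ is
lub closed (1. in Proposition 8), and that it is indeed more general than the stated result:
for the given $\tau \in \mathcal{E}\langle X\rangle^*$ and $\theta \in [X \rightharpoonup V]$, we pick $\Theta$ to be $\Theta_\tau$. We prove the general result
by induction on the length of $\tau$:

- If $|\tau| = 0$ then $\tau = \epsilon$ and $\epsilon{\upharpoonright}_\theta = \epsilon{\upharpoonright}_{\max\,(\theta]_\Theta} = \epsilon$.

- Now suppose that $\tau{\upharpoonright}_\theta = \tau{\upharpoonright}_{\max\,(\theta]_\Theta}$ for any $\tau \in \mathcal{E}\langle X\rangle^*$ with $\Theta_\tau \subseteq \Theta$ and $|\tau| = n \geq 0$,
and let us show that $\tau'{\upharpoonright}_\theta = \tau'{\upharpoonright}_{\max\,(\theta]_\Theta}$ for any $\tau' \in \mathcal{E}\langle X\rangle^*$ with $\Theta_{\tau'} \subseteq \Theta$ and $|\tau'| = n + 1$.
Pick such a $\tau'$ and let $\tau' = \tau\, e\langle\theta'\rangle$ for a $\tau \in \mathcal{E}\langle X\rangle^*$ with $|\tau| = n$ and an $e\langle\theta'\rangle \in \mathcal{E}\langle X\rangle$. Since
$\Theta_{\tau'} \subseteq \Theta$, by 6. in Proposition […] and by Proposition 13 it follows that $\Theta_\tau \subseteq \{\bot, \theta'\} \sqcup \Theta_\tau \subseteq \Theta$,
so the induction hypothesis implies $\tau{\upharpoonright}_\theta = \tau{\upharpoonright}_{\max\,(\theta]_\Theta}$. The rest follows noticing that $\theta' \sqsubseteq \theta$ iff
$\theta' \sqsubseteq \max\,(\theta]_\Theta$, which is a consequence of the definition of $\max\,(\theta]_\Theta$ because $\theta' \in \{\bot, \theta'\} \subseteq
\{\bot, \theta'\} \sqcup \Theta_\tau \subseteq \Theta$ (again by 6. in Proposition […] and by Proposition 13).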

Alternatively, one could have also done the proof above by induction on r, not on its 
length, but the proof would be more involved, because one would need to prove that the 
domain over which the property is universally quantified, namely "any $\tau \in \mathcal{E}\langle X\rangle^*$ with
$\Theta_\tau \subseteq \Theta$", is inductively generated. We therefore preferred to choose a more elementary
induction schema. □ 
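
The equality of Proposition 14 can be checked on the sample trace of this section with a small Python sketch. The encoding (instances as `frozenset`s of pairs, so that less-informative is set inclusion and the lub of two compatible instances is their union) and all function names are assumptions of this illustration; the closure is computed naively by saturating under pairwise lubs, which suffices for finite sets of partial functions.

```python
from itertools import combinations

BOT = frozenset()  # the partial function undefined everywhere


def inst(**bindings):
    return frozenset(bindings.items())


def compatible(t1, t2):
    # Two instances are compatible if they agree wherever both are defined.
    d = dict(t1)
    return all(d.get(x, v) == v for x, v in t2)


def lub_closure(instances):
    """Smallest set containing BOT and the instances, closed under lubs."""
    closed = {BOT} | set(instances)
    changed = True
    while changed:
        changed = False
        for t1, t2 in combinations(list(closed), 2):
            if compatible(t1, t2) and (t1 | t2) not in closed:
                closed.add(t1 | t2)
                changed = True
    return closed


def max_below(theta, Theta):
    # Largest element of Theta below theta; unique when Theta is lub closed.
    return max((t for t in Theta if t <= theta), key=len)


def slice_trace(trace, theta):
    return [e for e, theta_e in trace if theta_e <= theta]


A1, A2, B1, C1 = inst(a="a1"), inst(a="a2"), inst(b="b1"), inst(c="c1")
TRACE = [("e1", A1), ("e2", A2), ("e3", B1), ("e4", A2 | B1), ("e5", A1),
         ("e6", BOT), ("e7", B1), ("e8", C1), ("e9", A2 | B1),
         ("e10", A1 | B1 | C1), ("e11", BOT)]
THETA_TAU = lub_closure(theta for _, theta in TRACE)
```

For instance, for $\theta = a_1 b_2 c_1$ the largest instance below it in the closure is $a_1 c_1$, and slicing by either one yields $e_1 e_5 e_6 e_8 e_{11}$.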

6. Algorithm for Online Parametric Trace Slicing 

Definition 11 illustrates a way to slice a parametric trace for given parameter bindings.
However, it is not suitable for online trace slicing, where the trace is observed incrementally and
no future knowledge is available, because we cannot know all possible parameter instances
$\theta$ a priori. We next define an algorithm $\mathbb{A}\langle X\rangle$ that takes a parametric trace $\tau \in \mathcal{E}\langle X\rangle^*$
incrementally (i.e., event by event), and builds a partial function $T \in [[X \rightharpoonup V] \rightharpoonup \mathcal{E}^*]$ of
finite domain that serves as a quick lookup table for all slices of $\tau$. More precisely, Theorem
1 shows that, for any $\theta \in [X \rightharpoonup V]$, the trace slice $\tau{\upharpoonright}_\theta$ is $T(\max\,(\theta]_\Theta)$ after $\mathbb{A}\langle X\rangle$ processes $\tau$,
where $\Theta = \Theta_\tau$ is the domain of $T$, a finite lub closed set of partial functions also calculated
by $\mathbb{A}\langle X\rangle$ incrementally (see Definition 13 for $\Theta_\tau$). Therefore, assuming that $\mathbb{A}\langle X\rangle$ is run
on trace $\tau$, all one has to do in order to calculate a slice $\tau{\upharpoonright}_\theta$ for a given $\theta \in [X \rightharpoonup V]$
is to calculate $\max\,(\theta]_\Theta$ followed by a lookup into $T$. This way the trace $\tau$, which can
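
Algorithm $\mathbb{A}\langle X\rangle$

Input: parametric trace $\tau \in \mathcal{E}\langle X\rangle^*$

Output: map $T \in [[X \rightharpoonup V] \rightharpoonup \mathcal{E}^*]$ and set $\Theta \subseteq [X \rightharpoonup V]$

1 $T \leftarrow \bot$; $T(\bot) \leftarrow \epsilon$; $\Theta \leftarrow \{\bot\}$

2 foreach parametric event $e\langle\theta\rangle$ in order (first to last) in $\tau$ do

3 :  foreach $\theta' \in \{\theta\} \sqcup \Theta$ do

4 :  :  $T(\theta') \leftarrow T(\max\,(\theta']_\Theta)\, e$

5 :  endfor

6 :  $\Theta \leftarrow \{\bot, \theta\} \sqcup \Theta$

7 endfor

Figure 2: Parametric trace slicing algorithm $\mathbb{A}\langle X\rangle$.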

be very long, is processed/traversed only once, as it is being generated, and appropriate 
data-structures are maintained by our algorithm that allow for retrieval of slices for any 
parameter instance $\theta$, without having to traverse the trace $\tau$ again, as an algorithm blindly
following the definition of trace slicing (Definition 11) would do.

Figure 2 shows our trace slicing algorithm $\mathbb{A}\langle X\rangle$. In spite of $\mathbb{A}\langle X\rangle$'s small size, its proof
of correctness is surprisingly intricate, making use of almost all the mathematical machinery
developed so far in the paper. The algorithm $\mathbb{A}\langle X\rangle$ on input $\tau$, written more succinctly
$\mathbb{A}\langle X\rangle(\tau)$, traverses $\tau$ from its first event to its last event and, for each encountered event $e\langle\theta\rangle$,
updates both its data-structures, $T$ and $\Theta$. After processing each event, the relationship
between $T$ and $\Theta$ is that the latter is the domain of the former. Line 1 initializes the
data-structures: $T$ is undefined everywhere (i.e., $\bot$) except for the undefined-everywhere function
$\bot$, where $T(\bot) = \epsilon$; as expected, $\Theta$ is then initialized to the set $\{\bot\}$. The code (lines 3 to 6)
inside the outer loop (lines 2 to 7) can be triggered when a new event is received, as in most
online runtime verification systems. When a new event is received, say $e\langle\theta\rangle$, the mapping
$T$ is updated as follows: for each $\theta' \in [X \rightharpoonup V]$ that can be obtained by combining $\theta$ with
the compatible partial functions in the domain of the current $T$, update $T(\theta')$ by adding
the non-parametric event $e$ to the end of the slice corresponding to the largest (i.e., most
"knowledgeable") entry in the current table $T$ that is less informative or as informative as
$\theta'$; the $\Theta$ data-structure is then extended in line 6 (see Proposition 13 for why this way).

Example. Consider again the sample trace in Section 5 with events parametric in $\{a, b, c\}$,
namely $\tau = e_1\langle a_1\rangle\ e_2\langle a_2\rangle\ e_3\langle b_1\rangle\ e_4\langle a_2 b_1\rangle\ e_5\langle a_1\rangle\ e_6\langle\rangle\ e_7\langle b_1\rangle\ e_8\langle c_1\rangle\ e_9\langle a_2 b_1\rangle\ e_{10}\langle a_1 b_1 c_1\rangle\ e_{11}\langle\rangle$.
Table 1 shows how $\mathbb{A}\langle X\rangle$ works on $\tau$. An entry of the form $\langle\theta\rangle : w$ in a table cell corresponding
to a current parametric event $e\langle\theta\rangle$ means that $T(\theta) = w$ after processing all the parametric
events up to and including the current one; $T$ is undefined on any other partial function.
Obviously, the $\Theta$ corresponding to a cell is the union of all the $\theta$'s that appear in pairs $\langle\theta\rangle : w$
in that cell. Note that, as each parametric event $e\langle\theta\rangle$ is processed, the non-parametric event
$e$ is added at most once to each slice, and that $\Theta$ stays lub closed. □
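
The algorithm of Figure 2 can be transcribed almost line by line into Python. The following is a minimal sketch rather than a practical implementation; it assumes parameter instances are encoded as `frozenset`s of (parameter, value) pairs, and the names `slice_online`, `max_below` and `lookup` are hypothetical. The comments refer to the line numbers of Figure 2.

```python
BOT = frozenset()  # the partial function undefined everywhere


def inst(**bindings):
    return frozenset(bindings.items())


def compatible(t1, t2):
    # Two instances are compatible if they agree wherever both are defined.
    d = dict(t1)
    return all(d.get(x, v) == v for x, v in t2)


def max_below(theta, Theta):
    # Largest element of Theta below theta; unique when Theta is lub closed.
    return max((t for t in Theta if t <= theta), key=len)


def slice_online(trace):
    T = {BOT: []}                                            # line 1
    Theta = {BOT}
    for e, theta in trace:                                   # line 2
        # All lubs of theta with the compatible instances seen so far.
        targets = {theta | t for t in Theta if compatible(theta, t)}
        for target in targets:                               # line 3
            T[target] = T[max_below(target, Theta)] + [e]    # line 4
        Theta = Theta | targets                              # line 6
    return T, Theta


def lookup(T, Theta, theta):
    """Slice for an arbitrary instance theta, via the largest table entry below it."""
    return T[max_below(theta, Theta)]


A1, A2, B1, C1 = inst(a="a1"), inst(a="a2"), inst(b="b1"), inst(c="c1")
TRACE = [("e1", A1), ("e2", A2), ("e3", B1), ("e4", A2 | B1), ("e5", A1),
         ("e6", BOT), ("e7", B1), ("e8", C1), ("e9", A2 | B1),
         ("e10", A1 | B1 | C1), ("e11", BOT)]
T, THETA = slice_online(TRACE)
```

The table is updated in place at line 4, mirroring the figure. Each update creates a new list, so entries never share storage, and the order in which `targets` is enumerated does not affect the result on this trace.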

$\mathbb{A}\langle X\rangle$ computes trace slices for all combinations of parameter instances observed in
parametric trace events. Its complexity is therefore $O(n \times m)$ where $n$ is the length of the
trace and $m$ is the number of all possible parameter combinations. However, $\mathbb{A}\langle X\rangle$ is not
intended to be implemented directly; it is only used as a correctness backbone for other trace 
analysis algorithms, such as the monitoring algorithms discussed below. An alternative and 

Table 1: A run of the trace slicing algorithm $\mathbb{A}\langle X\rangle$ (top-left table first, followed by bottom-
left table, followed by the right table).



apparently more efficient solution is to only record trace slices for parameter instances that 
actually appear in the trace (instead of for all combinations of them) , and then construct the 
slice for a given parameter instance by combining such trace slices for compatible parameter 
instances. However, the complexity of constructing all possible trace slices at the end using
such an algorithm is also $O(n \times m)$, so it would not bring any benefit overall compared to
$\mathbb{A}\langle X\rangle$. In addition, $\mathbb{A}\langle X\rangle$ is more suitable as a backbone for developing online monitoring
algorithms such as those in Section 7, because each event is sent to its slices (that will be
consumed by corresponding monitors) and never touched again. 

$\mathbb{A}\langle X\rangle$ compactly and uniformly captures several special cases and subcases that are
worth discussing. The discussion below can be formalized as an inductive (on the length
of $\tau$) proof of correctness for $\mathbb{A}\langle X\rangle$, but we prefer to keep this discussion informal and give
a rigorous proof shortly after. The role of this discussion is twofold: (1) to better explain
the algorithm $\mathbb{A}\langle X\rangle$, providing the reader with additional intuition for its difficulty and
compactness, and (2) to give a proof sketch for the correctness of $\mathbb{A}\langle X\rangle$.

Let us first note that a partial function added to $\Theta$ will never be removed from $\Theta$;
that is because $\Theta \subseteq \{\bot, \theta\} \sqcup \Theta$. The same holds true for the domain of $T$, because line 4
can only add new elements to Dom($T$); in fact, the domain of $T$ is extended with precisely
the set $\{\theta\} \sqcup \Theta$ after each event parametric in $\theta$ is processed by $\mathbb{A}\langle X\rangle$. Moreover, since
Dom($T$) $= \Theta = \Theta_\epsilon = \{\bot\}$ initially and since 5. and 7. in Proposition […] imply $\Theta \cup (\{\theta\} \sqcup \Theta) =
\{\bot, \theta\} \sqcup \Theta$ while Proposition 13 states that $\Theta_{\tau e\langle\theta\rangle} = \{\bot, \theta\} \sqcup \Theta_\tau$, we can inductively show
that Dom($T$) $= \Theta = \Theta_\tau$ each time after $\mathbb{A}\langle X\rangle$ is executed on a parametric trace $\tau$.

Each $\theta'$ considered by the loop at lines 3-5 has the property that $\theta \sqsubseteq \theta'$, and at (precisely)
one iteration of the loop $\theta'$ is $\theta$; indeed, $\theta \in \{\theta\} \sqcup \Theta$ because $\bot \in \Theta$. Thanks to
Proposition 14, Theorem 1 holds essentially iff $T(\theta') = \tau{\upharpoonright}_{\theta'}$ after $T(\theta')$ is updated in line 4.
A tricky observation which is crucial for this is that 3. in Proposition 8 implies that the
updates of $T(\theta')$ do not interfere with each other for different $\theta' \in \{\theta\} \sqcup \Theta$; otherwise the
non-parametric event $e$ may wrongly be added multiple times to some trace slices $T(\theta')$.

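Let us next informally argue, inductively, that it is indeed the case that $T(\theta') = \tau{\upharpoonright}_{\theta'}$ after
$T(\theta')$ is updated in line 4 (it vacuously holds on the empty trace). Since $\max\,(\theta']_\Theta \in \Theta$, the
inductive hypothesis tells us that $T(\max\,(\theta']_\Theta) = \tau{\upharpoonright}_{\max\,(\theta']_\Theta}$; these are further equal to $\tau{\upharpoonright}_{\theta'}$
by Proposition 14. Since $\theta \sqsubseteq \theta'$, the definition of trace slicing implies that $(\tau\, e\langle\theta\rangle){\upharpoonright}_{\theta'} = \tau{\upharpoonright}_{\theta'}\, e$.
Therefore, $T(\theta')$ is indeed $(\tau\, e\langle\theta\rangle){\upharpoonright}_{\theta'}$ after line 4 of $\mathbb{A}\langle X\rangle$ is executed while processing the
event $e\langle\theta\rangle$ that follows trace $\tau$. This concludes our informal proof sketch; let us next give
a rigorous proof of correctness for our trace slicing algorithm $\mathbb{A}\langle X\rangle$.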

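Definition 14. Let $\mathbb{A}\langle X\rangle(\tau).T$ and $\mathbb{A}\langle X\rangle(\tau).\Theta$ be the two data-structures ($T$ and $\Theta$)
maintained by the algorithm $\mathbb{A}\langle X\rangle$ in Figure 2 after it processes $\tau$.

Theorem 1. With the notation in Definition 14, the following hold for any $\tau \in \mathcal{E}\langle X\rangle^*$:

(1) Dom($\mathbb{A}\langle X\rangle(\tau).T$) $= \mathbb{A}\langle X\rangle(\tau).\Theta = \Theta_\tau$;

(2) $\mathbb{A}\langle X\rangle(\tau).T(\theta) = \tau{\upharpoonright}_\theta$ for any $\theta \in \mathbb{A}\langle X\rangle(\tau).\Theta$;

(3) $\tau{\upharpoonright}_\theta = \mathbb{A}\langle X\rangle(\tau).T(\max\,(\theta]_{\mathbb{A}\langle X\rangle(\tau).\Theta})$ for any $\theta \in [X \rightharpoonup V]$.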

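Proof. Since $\mathbb{A}\langle X\rangle$ processes the events in the input trace in order, when given the input
$\tau\, e\langle\theta\rangle$, the $\Theta$ and $T$ structures after $\mathbb{A}\langle X\rangle$ processes $\tau$ but before it processes $e\langle\theta\rangle$ (i.e., right
before the last iteration of the loop at lines 2-7) are precisely $\mathbb{A}\langle X\rangle(\tau).\Theta$ and $\mathbb{A}\langle X\rangle(\tau).T$,
respectively. Further, the loop at lines 3-5 updates $T$ on all $\theta' \in \{\theta\} \sqcup \Theta$; in case $T$ was
not defined on such a $\theta'$, then it will be defined after $e\langle\theta\rangle$ is processed. The domain of
$T$ thus grows or remains unchanged as parametric events are processed, but it never
shrinks. With these observations, we can prove 1. by
induction on $\tau$. If $\tau = \epsilon$ then Dom($\mathbb{A}\langle X\rangle(\epsilon).T$) $= \mathbb{A}\langle X\rangle(\epsilon).\Theta = \Theta_\epsilon = \{\bot\}$. Suppose now
that Dom($\mathbb{A}\langle X\rangle(\tau).T$) $= \mathbb{A}\langle X\rangle(\tau).\Theta = \Theta_\tau$ holds for $\tau \in \mathcal{E}\langle X\rangle^*$, and let $e\langle\theta\rangle \in \mathcal{E}\langle X\rangle$ be any
parametric event. Then the following concludes the proof of 1.:

Dom($\mathbb{A}\langle X\rangle(\tau\, e\langle\theta\rangle).T$) $=$ Dom($\mathbb{A}\langle X\rangle(\tau).T$) $\cup\ (\{\theta\} \sqcup \mathbb{A}\langle X\rangle(\tau).\Theta)$

$= \mathbb{A}\langle X\rangle(\tau).\Theta \cup (\{\theta\} \sqcup \mathbb{A}\langle X\rangle(\tau).\Theta)$

$= (\{\bot\} \sqcup \mathbb{A}\langle X\rangle(\tau).\Theta) \cup (\{\theta\} \sqcup \mathbb{A}\langle X\rangle(\tau).\Theta)$

$= \{\bot, \theta\} \sqcup \mathbb{A}\langle X\rangle(\tau).\Theta$

$= \mathbb{A}\langle X\rangle(\tau\, e\langle\theta\rangle).\Theta$

$= \{\bot, \theta\} \sqcup \Theta_\tau$

$= \Theta_{\tau e\langle\theta\rangle}$,

where the first equality follows from how the loop at lines 3-5 updates $T$, the second by the
induction hypothesis, the third by 5. in Proposition […], the fourth by 7. in Proposition […],
the fifth by how $\Theta$ is updated at line 6, the sixth again by the induction hypothesis, and,
finally, the seventh by Proposition 13.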

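Before we continue, let us first prove the following property:

$\mathbb{A}\langle X\rangle(\tau\, e\langle\theta\rangle).T(\theta') = \mathbb{A}\langle X\rangle(\tau).T(\max\,(\theta']_{\mathbb{A}\langle X\rangle(\tau).\Theta})\, e$

for any $e\langle\theta\rangle \in \mathcal{E}\langle X\rangle$ and any $\theta' \in \{\theta\} \sqcup \mathbb{A}\langle X\rangle(\tau).\Theta$.
One should be careful here not to be tricked into thinking that this property is straightforward
because it says only what line 4 of $\mathbb{A}\langle X\rangle$ does. The complexity comes from the fact that
if there were two different $\theta_1, \theta_2 \in \{\theta\} \sqcup \mathbb{A}\langle X\rangle(\tau).\Theta$ such that $\theta_1 = \max\,(\theta_2]_{\mathbb{A}\langle X\rangle(\tau).\Theta}$ then
an unfortunate enumeration of the partial functions $\theta'$ in $\{\theta\} \sqcup \mathbb{A}\langle X\rangle(\tau).\Theta$ by the loop at
lines 3-5 may lead to the non-parametric event $e$ being added twice to a slice: indeed, if $\theta_1$ is
processed before $\theta_2$, then $e$ is first added to the end of $T(\theta_1)$ when $\theta' = \theta_1$, and then $T(\theta_1)\, e$
is assigned to $T(\theta_2)$ when $\theta' = \theta_2$; this way, $T(\theta_2)$ ends up accumulating $e$ twice instead of
once, which is obviously wrong. Fortunately, since $\mathbb{A}\langle X\rangle(\tau).\Theta$ is lub closed (by 1. above and
Proposition 12), 3. in Proposition 8 implies that there are no such different $\theta_1, \theta_2 \in \{\theta\} \sqcup
\mathbb{A}\langle X\rangle(\tau).\Theta$. Therefore, there is no interference between the various assignments at line 4,
regardless of the order in which the partial functions $\theta' \in \{\theta\} \sqcup \Theta$ are enumerated, which
means that, indeed, $\mathbb{A}\langle X\rangle(\tau\, e\langle\theta\rangle).T(\theta') = \mathbb{A}\langle X\rangle(\tau).T(\max\,(\theta']_{\mathbb{A}\langle X\rangle(\tau).\Theta})\, e$ for any $e\langle\theta\rangle \in
\mathcal{E}\langle X\rangle$ and for any $\theta' \in \{\theta\} \sqcup \mathbb{A}\langle X\rangle(\tau).\Theta$. This lack of interference between updates of $T$
also suggests an important implementation optimization: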

The loop at lines 3-5 can be parallelized without duplicating the table T! 

Of course, the loop can be parallelized anyway if the table is duplicated and then merged
into the original table, in the sense that all the writes to $\mathbb{T}(\theta')$ are done in a copy of $\mathbb{T}$.
However, experiments show that the table $\mathbb{T}$ can be huge in real applications, on
the order of billions of entries, so duplicating and merging it can be prohibitive.

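2. can now be proved by induction on the length of $\tau$. If $\tau = \epsilon$ then $\mathbb{A}\langle X\rangle(\epsilon).\Theta = \{\bot\}$,
so $\theta' \in \mathbb{A}\langle X\rangle(\epsilon).\Theta$ can only be $\bot$; then $\mathbb{A}\langle X\rangle(\epsilon).\mathbb{T}(\bot) = \epsilon{\upharpoonright}_{\bot} = \epsilon$. Suppose now that
$\mathbb{A}\langle X\rangle(\tau).\mathbb{T}(\theta') = \tau{\upharpoonright}_{\theta'}$ for any $\theta' \in \mathbb{A}\langle X\rangle(\tau).\Theta$ and let us show that $\mathbb{A}\langle X\rangle(\tau\,e\langle\theta\rangle).\mathbb{T}(\theta') = (\tau\,e\langle\theta\rangle){\upharpoonright}_{\theta'}$
for any $\theta' \in \mathbb{A}\langle X\rangle(\tau\,e\langle\theta\rangle).\Theta$. As shown in the proof of 1. above, $\mathbb{A}\langle X\rangle(\tau\,e\langle\theta\rangle).\Theta = \mathbb{A}\langle X\rangle(\tau).\Theta \cup (\{\theta\} \sqcup \mathbb{A}\langle X\rangle(\tau).\Theta)$,
so we have two cases to analyze. First, if $\theta' \in \{\theta\} \sqcup \mathbb{A}\langle X\rangle(\tau).\Theta$
then $\theta \sqsubseteq \theta'$ and so $(\tau\,e\langle\theta\rangle){\upharpoonright}_{\theta'} = \tau{\upharpoonright}_{\theta'}\,e$; further,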

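\begin{align*}
\mathbb{A}\langle X\rangle(\tau\,e\langle\theta\rangle).\mathbb{T}(\theta')
  &= \mathbb{A}\langle X\rangle(\tau).\mathbb{T}(\max\,(\theta']_{\mathbb{A}\langle X\rangle(\tau).\Theta})\;e \\
  &= \tau{\upharpoonright}_{\theta'}\;e \\
  &= (\tau\,e\langle\theta\rangle){\upharpoonright}_{\theta'},
\end{align*}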

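where the first equality follows by the auxiliary property proved above, the second by the
induction hypothesis using the fact that $\max\,(\theta']_{\mathbb{A}\langle X\rangle(\tau).\Theta} \in \mathbb{A}\langle X\rangle(\tau).\Theta$, and the third by
Proposition 11. Second, if $\theta' \in \mathbb{A}\langle X\rangle(\tau).\Theta$ but $\theta' \notin \{\theta\} \sqcup \mathbb{A}\langle X\rangle(\tau).\Theta$ then $\theta \not\sqsubseteq \theta'$ and so
$(\tau\,e\langle\theta\rangle){\upharpoonright}_{\theta'} = \tau{\upharpoonright}_{\theta'}$; furthermore,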

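\begin{align*}
\mathbb{A}\langle X\rangle(\tau\,e\langle\theta\rangle).\mathbb{T}(\theta')
  &= \mathbb{A}\langle X\rangle(\tau).\mathbb{T}(\theta') \\
  &= \tau{\upharpoonright}_{\theta'} \\
  &= (\tau\,e\langle\theta\rangle){\upharpoonright}_{\theta'},
\end{align*}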

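where the first equality holds because $\theta'$ is not considered by the loop in lines 3-5 in $\mathbb{A}\langle X\rangle$,
that is, $\theta' \notin \{\theta\} \sqcup \mathbb{A}\langle X\rangle(\tau).\Theta$, and the second equality follows by the induction hypothesis,
as $\theta' \in \mathbb{A}\langle X\rangle(\tau).\Theta$. Therefore, $\mathbb{A}\langle X\rangle(\tau\,e\langle\theta\rangle).\mathbb{T}(\theta') = (\tau\,e\langle\theta\rangle){\upharpoonright}_{\theta'}$ for any $\theta' \in \mathbb{A}\langle X\rangle(\tau\,e\langle\theta\rangle).\Theta$,
which completes the proof of 2.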

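3. is the main result concerning our trace slicing algorithm and it now follows easily:

\begin{align*}
\tau{\upharpoonright}_{\theta}
  &= \tau{\upharpoonright}_{\max\,(\theta]_{\Theta_\tau}} \\
  &= \tau{\upharpoonright}_{\max\,(\theta]_{\mathbb{A}\langle X\rangle(\tau).\Theta}} \\
  &= \mathbb{A}\langle X\rangle(\tau).\mathbb{T}(\max\,(\theta]_{\mathbb{A}\langle X\rangle(\tau).\Theta})
\end{align*}

The first equality follows by Proposition 14, the second equality by 1. above, and the third
equality by 2. above, because $\max\,(\theta]_{\mathbb{A}\langle X\rangle(\tau).\Theta} \in \mathbb{A}\langle X\rangle(\tau).\Theta$. This concludes the correctness
proof of our trace slicing algorithm $\mathbb{A}\langle X\rangle$. □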
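
As a concrete check of what 3. promises, the sketch below computes trace slices directly from the slicing equations used in the proof above: an event $e\langle\theta'\rangle$ contributes $e$ to the slice of $\theta$ exactly when $\theta' \sqsubseteq \theta$. It is not the algorithm $\mathbb{A}\langle X\rangle$ itself. The encoding of parameter instances as Python dictionaries and the names `leq` and `trace_slice` are assumptions made only for illustration; the trace is the iterator example from the abstract.

```python
from typing import Dict, List, Tuple

Instance = Dict[str, str]        # a partial map from parameters to values
Event = Tuple[str, Instance]     # a parametric event e<theta>


def leq(small: Instance, big: Instance) -> bool:
    """True when `small` is less informative than (or equal to) `big`."""
    return all(p in big and big[p] == v for p, v in small.items())


def trace_slice(trace: List[Event], theta: Instance) -> List[str]:
    """Keep the base events whose parameter instance lies below theta."""
    return [name for name, binding in trace if leq(binding, theta)]


trace = [
    ("createIter", {"c": "c1", "i": "i1"}),
    ("next", {"i": "i1"}),
    ("createIter", {"c": "c1", "i": "i2"}),
    ("updateColl", {"c": "c1"}),
    ("next", {"i": "i1"}),
]

slice_i1 = trace_slice(trace, {"c": "c1", "i": "i1"})
slice_i2 = trace_slice(trace, {"c": "c1", "i": "i2"})
```

Under these assumptions, `slice_i1` and `slice_i2` are the two slices quoted in the abstract. The point of $\mathbb{A}\langle X\rangle$ is to obtain the same slices incrementally, without rescanning the trace for every instance.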
7. Monitors and Parametric Monitors

In this section we first define monitors $M$ as a variant of Moore machines with potentially
infinitely many states; then we define parametric monitors $\Lambda X.M$ as monitors maintaining
one state of $M$ per parameter instance. As for parametric properties, which turned out to
be just properties over parametric traces, we show that parametric monitors are also just
monitors, but for parametric events and with instance-indexed states and output categories.
We also show that a parametric monitor $\Lambda X.M$ is a monitor for the parametric property
$\Lambda X.P$, with $P$ the property monitored by $M$.

7.1. The Non-Parametric Case. We start by defining non-parametric monitors as a
variant of (deterministic) Moore machines [29] that allows infinitely many states:

Definition 15. A monitor $M$ is a tuple $(S, \mathcal{E}, \mathcal{C}, \iota, \sigma : S \times \mathcal{E} \rightarrow S, \gamma : S \rightarrow \mathcal{C})$, where $S$ is
a set of states, $\mathcal{E}$ is a set of input events, $\mathcal{C}$ is a set of output categories, $\iota \in S$ is the initial
state, $\sigma$ is the transition function, and $\gamma$ is the output function. The transition function is
extended to $\sigma : S \times \mathcal{E}^* \rightarrow S$ as expected: $\sigma(s, \epsilon) = s$ and $\sigma(s, we) = \sigma(\sigma(s, w), e)$ for any
$s \in S$, $e \in \mathcal{E}$, and $w \in \mathcal{E}^*$.
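
Definition 15 translates almost literally into code. The following is a minimal sketch, not an implementation from the paper: the class name `Monitor`, the method names `run` and `classify`, and the example property (every `next` must directly follow a `hasNext`) are all hypothetical and chosen only to make the tuple $(S, \mathcal{E}, \mathcal{C}, \iota, \sigma, \gamma)$ tangible.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterable, TypeVar

S = TypeVar("S")   # states
E = TypeVar("E")   # input events
C = TypeVar("C")   # output categories


@dataclass(frozen=True)
class Monitor(Generic[S, E, C]):
    iota: S                        # initial state
    sigma: Callable[[S, E], S]     # transition function
    gamma: Callable[[S], C]        # output function

    def run(self, state: S, word: Iterable[E]) -> S:
        """Extension of sigma to words: sigma(s, w e) = sigma(sigma(s, w), e)."""
        for event in word:
            state = self.sigma(state, event)
        return state

    def classify(self, word: Iterable[E]) -> C:
        """The category gamma(sigma(iota, w)) assigned to the trace w."""
        return self.gamma(self.run(self.iota, word))


# Hypothetical base property: every "next" must directly follow a "hasNext".
def sigma(state: str, event: str) -> str:
    if state == "error":
        return "error"
    if event == "hasNext":
        return "checked"
    if event == "next":
        return "unchecked" if state == "checked" else "error"
    return state


has_next = Monitor(
    iota="unchecked",
    sigma=sigma,
    gamma=lambda s: "violation" if s == "error" else "?",
)
```

Here `"?"` plays the role of the "don't know" category discussed below. Because states are only ever produced by `sigma`, nothing in this encoding requires the state space to be finite.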

The notion of a monitor above is rather conceptual. Actual implementations of monitors
need not generate the whole state space a priori, but only on a "by need" basis. Consider, for
example, a monitor for a property specified using an NFA which performs an NFA-to-DFA 
construction on the fly, as events are received. Such a monitor generates only those states 
in the DFA that are needed by the monitored execution trace. Moreover, the monitor only 
needs to store one such state of the DFA, i.e., set of states in the NFA, namely the current 
one: once an event is received, the next state is (deterministically) computed and the old 
one is discarded. Therefore, assuming that one needs constant space to store a state of the 
original NFA, then the memory needed by this monitor is linear in the number of states 
of the NFA. An alternative and probably more conventional monitor could be one which 
generates the corresponding DFA statically, paying upfront the exponential price in both 
time and space. As empirically suggested by [35], if one is able to statically generate and 
store the corresponding DFA then one should most likely take this route, because in practice 
it tends to be much faster to jump to a known next state than to compute it. 
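
The on-the-fly construction described above can be sketched as follows. This is only an illustration under stated assumptions: the class `LazyNfaMonitor`, its `step` method, and the example NFA (words over `a`, `b` whose second-to-last symbol is `a`) are hypothetical and not taken from the paper. The monitor stores a single DFA state, i.e., the current set of NFA states, and discards the previous one at every event.

```python
from typing import Dict, Iterable, Set, Tuple

Delta = Dict[Tuple[str, str], Set[str]]   # (NFA state, event) -> successor states


class LazyNfaMonitor:
    def __init__(self, delta: Delta, initial: Iterable[str], accepting: Iterable[str]):
        self.delta = delta
        self.accepting = frozenset(accepting)
        self.current = frozenset(initial)     # the only DFA state kept in memory

    def step(self, event: str) -> str:
        """Compute the next DFA state from the current one and report a category."""
        self.current = frozenset(
            target
            for state in self.current
            for target in self.delta.get((state, event), ())
        )
        return "match" if self.current & self.accepting else "?"


# Hypothetical NFA: accepts words whose second-to-last symbol is "a".
delta: Delta = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "a"): {"q2"},
    ("q1", "b"): {"q2"},
}
monitor = LazyNfaMonitor(delta, initial={"q0"}, accepting={"q2"})
outputs = [monitor.step(event) for event in "aba"]
```

The stored set never has more elements than the NFA has states, which is the linear memory bound mentioned above; the price is that each step recomputes the successor set instead of following a precomputed DFA edge.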

Allowing monitors with infinitely many states is a necessity in our context. Even though 
only a finite number of states is reached during any given (finite) execution trace, there is, in 
general, no bound on how many states are reached. For example, monitors for context-free 
grammars like the ones in [26] have potentially unbounded stacks as part of their state.
Also, as shown shortly, parametric monitors have domains of functions as state spaces, 
which are infinite as well. Nevertheless, what is common to all monitors is that they can 
classify traces into categories. When a monitor does not have enough information about 
a trace to put it in a category of interest, we can assume that it actually categorizes it 
as a "don't know" trace, where "don't know" can be regarded as a special category; this 
is similar to regarding partial functions as total functions by adding a special "undefined" 
value in their codomain. The following is therefore natural: 

Definition 16. Monitor $M = (S, \mathcal{E}, \mathcal{C}, \iota, \sigma, \gamma)$ is a monitor for property $P : \mathcal{E}^* \rightarrow \mathcal{C}$ if
and only if $\gamma(\sigma(\iota, w)) = P(w)$ for each $w \in \mathcal{E}^*$.

A property can be associated to each monitor, in the same way that a language is
associated to each automaton:

Definition 17. Monitor $M = (S, \mathcal{E}, \mathcal{C}, \iota, \sigma, \gamma)$ defines the $M$-property $\mathcal{P}_M : \mathcal{E}^* \rightarrow \mathcal{C}$ as
follows: $\mathcal{P}_M(w) = \gamma(\sigma(\iota, w))$ for each $w \in \mathcal{E}^*$.

The following result is straightforward: it follows immediately from Definitions 16 and
17. The only reason we frame it as a numbered proposition is that we need to refer to
it in the proof of Corollary 5.

Proposition 15. With the notation in Definition 17, monitor $M$ is indeed a monitor for
its corresponding $M$-property $\mathcal{P}_M$. Moreover, a monitor can only be a monitor for one
property, that is, if $M$ is a monitor for property $P$ then $P = \mathcal{P}_M$.

Since we allow monitors to have infinitely many states, there is a strong correspondence 
between properties and monitors: 

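Definition 18. Property $P : \mathcal{E}^* \rightarrow \mathcal{C}$ defines the $P$-monitor $\mathcal{M}_P = (S_P, \mathcal{E}, \mathcal{C}, \iota_P, \sigma_P, \gamma_P)$
as follows:

$S_P = \mathcal{E}^*$,
$\iota_P = \epsilon$,
$\sigma_P(w, e) = we$ for each $w \in S_P = \mathcal{E}^*$ and $e \in \mathcal{E}$,
$\gamma_P(w) = P(w)$ for each $w \in S_P = \mathcal{E}^*$.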

Thus, $\mathcal{M}_P$ holds traces as states, appends events to them as transition and, as output,
it looks up the category of the corresponding trace using $P$. The following results are also
straightforward and, again, we frame them as numbered propositions only because we will
refer to them later.
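
The $P$-monitor is easy to write down, which also makes plain why it is conceptual rather than practical: its state is the entire trace seen so far. The sketch below is illustrative only; `p_monitor`, `classify`, and the example property `balanced` are hypothetical names, not part of the paper.

```python
from typing import Callable, Iterable, Tuple

Trace = Tuple[str, ...]


def p_monitor(prop: Callable[[Trace], str]):
    """Return (iota, sigma, gamma) of the monitor whose states are traces."""
    iota: Trace = ()                      # the empty trace

    def sigma(w: Trace, e: str) -> Trace:
        return w + (e,)                   # the transition appends the event

    def gamma(w: Trace) -> str:
        return prop(w)                    # the output looks the trace up in P

    return iota, sigma, gamma


def classify(monitor, word: Iterable[str]) -> str:
    iota, sigma, gamma = monitor
    state = iota
    for event in word:
        state = sigma(state, event)
    return gamma(state)


# Hypothetical property: equally many "open" and "close" events.
def balanced(w: Trace) -> str:
    return "balanced" if w.count("open") == w.count("close") else "unbalanced"


monitor = p_monitor(balanced)
```

By construction, `classify(monitor, w)` and `balanced(tuple(w))` agree on every word `w`, which is the content of the next proposition.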

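Proposition 16. With the notation in Definition 18, the monitor $\mathcal{M}_P$ is indeed a monitor
for property $P$.

Proof. It follows from the sequence of equalities $\gamma_P(\sigma_P(\iota_P, w)) = \gamma_P(\sigma_P(\epsilon, w)) = \gamma_P(\epsilon w) =
\gamma_P(w) = P(w)$. □

Proposition 17. With the notations in Definitions 17 and 18, $\mathcal{P}_{\mathcal{M}_P} = P$ for any property
$P : \mathcal{E}^* \rightarrow \mathcal{C}$.

Proof. $\mathcal{P}_{\mathcal{M}_P}(w) = \gamma_P(\sigma_P(\iota_P, w)) = P(w)$ for any $w \in \mathcal{E}^*$. □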

The equality of monitors $\mathcal{M}_{\mathcal{P}_M} = M$ does not hold for every monitor $M$; it does hold
when $M = \mathcal{M}_P$ for some property $P$, though.

Definition 19. Monitors $M$ and $M'$ are property equivalent, or just equivalent, written
$M \equiv M'$, iff they are monitors for the same property (see Definition 16). With the notation
in Definition 17, we have that $M \equiv M'$ iff $\mathcal{P}_M = \mathcal{P}_{M'}$.

Proposition 18. With the notations in Definitions 17 and 18, $\mathcal{M}_{\mathcal{P}_M} \equiv M$ for any monitor
$M = (S, \mathcal{E}, \mathcal{C}, \iota, \sigma, \gamma)$.

Proof. By Definition 19, $\mathcal{M}_{\mathcal{P}_M} \equiv M$ iff $\mathcal{P}_{\mathcal{M}_{\mathcal{P}_M}} = \mathcal{P}_M$, and the latter follows by Proposition
17, taking $P$ to be $\mathcal{P}_M$. □
7.2. The Parametric Case. We next define parametric monitors in the same style as
the other parametric entities defined in this paper: starting with a base monitor and a set 
of parameters, the corresponding parametric monitor can be thought of as a set of base 
monitors running in parallel, one for each parameter instance. 

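Definition 20. Given parameter set $X$ with corresponding values $V$ and a monitor $M =
(S, \mathcal{E}, \mathcal{C}, \iota, \sigma : S \times \mathcal{E} \rightarrow S, \gamma : S \rightarrow \mathcal{C})$, we define the parametric monitor $\Lambda X.M$ as
the monitor

$$([[X \rightharpoonup V] \rightarrow S],\ \mathcal{E}\langle X\rangle,\ [[X \rightharpoonup V] \rightarrow \mathcal{C}],\ \lambda\theta.\iota,\ \Lambda X.\sigma,\ \Lambda X.\gamma),$$

with

$$\Lambda X.\sigma : [[X \rightharpoonup V] \rightarrow S] \times \mathcal{E}\langle X\rangle \rightarrow [[X \rightharpoonup V] \rightarrow S]$$
$$\Lambda X.\gamma : [[X \rightharpoonup V] \rightarrow S] \rightarrow [[X \rightharpoonup V] \rightarrow \mathcal{C}]$$

defined as

$$(\Lambda X.\sigma)(\delta, e\langle\theta'\rangle)(\theta) = \begin{cases} \sigma(\delta(\theta), e) & \text{if } \theta' \sqsubseteq \theta \\ \delta(\theta) & \text{if } \theta' \not\sqsubseteq \theta \end{cases}$$

$$(\Lambda X.\gamma)(\delta)(\theta) = \gamma(\delta(\theta))$$

for any $\delta \in [[X \rightharpoonup V] \rightarrow S]$ and any $\theta, \theta' \in [X \rightharpoonup V]$.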

Therefore, a state $\delta$ of parametric monitor $\Lambda X.M$ maintains a state $\delta(\theta)$ of $M$ for each
parameter instance $\theta$, takes parametric events as input, and outputs categories indexed by
parameter instances (one output category of $M$ per parameter instance).
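
Since a state $\delta$ of $\Lambda X.M$ is a total function over all parameter instances, it cannot be stored as a finite table in general. One way to make Definition 20 executable anyway is to represent $\delta$ as a closure, as in the sketch below. This is an illustration under assumptions, not the paper's implementation: the names `leq`, `param_sigma`, `param_gamma`, and `run`, the dictionary encoding of instances, and the `hasNext`/`next` base monitor are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Instance = Dict[str, str]                 # partial map from parameters to values
Event = Tuple[str, Instance]              # parametric event e<theta'>
State = Callable[[Instance], str]         # delta: every instance -> base state


def leq(small: Instance, big: Instance) -> bool:
    return all(p in big and big[p] == v for p, v in small.items())


def param_sigma(sigma, delta: State, event: Event) -> State:
    """(Lambda X . sigma)(delta, e<theta'>): update only instances above theta'."""
    name, theta_prime = event

    def new_delta(theta: Instance) -> str:
        old = delta(theta)
        return sigma(old, name) if leq(theta_prime, theta) else old

    return new_delta


def param_gamma(gamma, delta: State):
    """(Lambda X . gamma)(delta): one output category per instance."""
    return lambda theta: gamma(delta(theta))


def run(iota, sigma, gamma, trace: List[Event]):
    delta: State = lambda theta: iota     # initial state: lambda theta . iota
    for event in trace:
        delta = param_sigma(sigma, delta, event)
    return param_gamma(gamma, delta)


# Hypothetical base monitor: every "next" must directly follow a "hasNext".
def sigma(state: str, event: str) -> str:
    if state == "error":
        return "error"
    if event == "hasNext":
        return "checked"
    if event == "next":
        return "unchecked" if state == "checked" else "error"
    return state


def gamma(state: str) -> str:
    return "violation" if state == "error" else "?"


out = run("unchecked", sigma, gamma, [
    ("hasNext", {"i": "i1"}),
    ("next", {"i": "i1"}),
    ("next", {"i": "i2"}),
])
```

Each lookup `out(theta)` replays the whole chain of closures, so its cost grows with the length of the trace. That is acceptable for a definition but not for an online monitor, which is the motivation for the table-based algorithms in Section 8.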

Proposition 19. If $M$ is a monitor for $P$ then parametric monitor $\Lambda X.M$ is a monitor
for parametric property $\Lambda X.P$, or, with the notation in Definition 17, $\mathcal{P}_{\Lambda X.M} = \Lambda X.\mathcal{P}_M$.

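Proof. We show that $(\Lambda X.\gamma)((\Lambda X.\sigma)(\lambda\theta.\iota, \tau)) = (\Lambda X.P)(\tau)$ for any $\tau \in \mathcal{E}\langle X\rangle^*$, i.e., after
application on $\theta \in [X \rightharpoonup V]$, that $\gamma((\Lambda X.\sigma)(\lambda\theta.\iota, \tau)(\theta)) = P(\tau{\upharpoonright}_{\theta})$ for any $\tau \in \mathcal{E}\langle X\rangle^*$ and
$\theta \in [X \rightharpoonup V]$. Since $M$ is a monitor for $P$, it suffices to show that $(\Lambda X.\sigma)(\lambda\theta.\iota, \tau)(\theta) =
\sigma(\iota, \tau{\upharpoonright}_{\theta})$ for any $\tau \in \mathcal{E}\langle X\rangle^*$ and $\theta \in [X \rightharpoonup V]$. We prove it by induction on $\tau$. If $\tau = \epsilon$ then
$(\Lambda X.\sigma)(\lambda\theta.\iota, \epsilon)(\theta) = (\lambda\theta.\iota)(\theta) = \iota = \sigma(\iota, \epsilon) = \sigma(\iota, \epsilon{\upharpoonright}_{\theta})$. Suppose that $(\Lambda X.\sigma)(\lambda\theta.\iota, \tau)(\theta) =
\sigma(\iota, \tau{\upharpoonright}_{\theta})$ for some arbitrary but fixed $\tau \in \mathcal{E}\langle X\rangle^*$ and for any $\theta \in [X \rightharpoonup V]$, and let $e\langle\theta'\rangle$
be any parametric event in $\mathcal{E}\langle X\rangle$ and let $\theta \in [X \rightharpoonup V]$ be any parameter instance. The
inductive step is then as follows:

\begin{align*}
(\Lambda X.\sigma)(\lambda\theta.\iota, \tau\,e\langle\theta'\rangle)(\theta)
  &= (\Lambda X.\sigma)((\Lambda X.\sigma)(\lambda\theta.\iota, \tau), e\langle\theta'\rangle)(\theta) \\
  &= (\Lambda X.\sigma)(\lambda\theta.\sigma(\iota, \tau{\upharpoonright}_{\theta}), e\langle\theta'\rangle)(\theta) \\
  &= \begin{cases} \sigma(\sigma(\iota, \tau{\upharpoonright}_{\theta}), e) & \text{if } \theta' \sqsubseteq \theta \\ \sigma(\iota, \tau{\upharpoonright}_{\theta}) & \text{if } \theta' \not\sqsubseteq \theta \end{cases} \\
  &= \begin{cases} \sigma(\iota, \tau{\upharpoonright}_{\theta}\,e) & \text{if } \theta' \sqsubseteq \theta \\ \sigma(\iota, \tau{\upharpoonright}_{\theta}) & \text{if } \theta' \not\sqsubseteq \theta \end{cases} \\
  &= \sigma(\iota, (\tau\,e\langle\theta'\rangle){\upharpoonright}_{\theta})
\end{align*}

The first equality above follows by the second part of Definition 15, the second by the
induction hypothesis, the third by Definition 20, the fourth again by the second part of
Definition 15, and the fifth by Definition 11. This concludes our proof. □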
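Algorithm $\mathbb{B}\langle X\rangle(M = (S, \mathcal{E}, \mathcal{C}, \iota, \sigma, \gamma))$

Input: finite parametric trace $\tau \in \mathcal{E}\langle X\rangle^*$

Output: mapping $\Gamma : [[X \rightharpoonup V] \rightharpoonup \mathcal{C}]$ and set $\Theta \subseteq [X \rightharpoonup V]$

1 $\Delta \leftarrow \bot$; $\Delta(\bot) \leftarrow \iota$; $\Theta \leftarrow \{\bot\}$
2 foreach parametric event $e\langle\theta\rangle$ in order in $\tau$ do
3     foreach $\theta' \in \{\theta\} \sqcup \Theta$ do
4         $\Delta(\theta') \leftarrow \sigma(\Delta(\max\,(\theta']_{\Theta}), e)$
5         $\Gamma(\theta') \leftarrow \gamma(\Delta(\theta'))$    // a message may be output here
6     endfor
7     $\Theta \leftarrow \{\bot, \theta\} \sqcup \Theta$
8 endfor

Figure 3: Parametric monitoring algorithm $\mathbb{B}\langle X\rangle$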



8. Algorithms for Parametric Trace Monitoring 

We next propose two monitoring algorithms for parametric properties. Our unoptimized but
easier to understand algorithm is easily derived from the parametric trace slicing algorithm
in Figure 2. Our second algorithm is an online optimization of the first, which significantly
reduces the size of the search space for compatible parameter instances when a new event
is received.



8.1. Unoptimized but Simpler Algorithm. Analyzing the definition of a parametric
monitor (Definition 20), the first thing we note is that its state space is not only infinite, but
it is not even enumerable. Therefore, a first challenge in monitoring parametric properties is
how to represent the states of the parametric monitor. Inspired by the algorithm for trace
slicing in Figure 2, we encode functions in $[[X \rightharpoonup V] \rightharpoonup S]$ as tables with entries indexed by
parameter instances in $[X \rightharpoonup V]$ and with contents states in $S$. Following similar arguments
as in the proof of the trace slicing algorithm, such tables will have a finite number of entries
provided that each event instantiates only a finite number of parameters.

Figure 3 shows our monitoring algorithm for parametric properties. Given parametric
property $\Lambda X.P$ and $M$ a monitor for $P$, $\mathbb{B}\langle X\rangle(M)$ yields a monitor that is equivalent to
$\Lambda X.M$, that is, a monitor for $\Lambda X.P$. Section 9 shows one way to use this algorithm: a
monitor $M$ is first synthesized from the base property $P$, then that monitor $M$ is used to
synthesize the monitor $\mathbb{B}\langle X\rangle(M)$ for the parametric property $\Lambda X.P$. $\mathbb{B}\langle X\rangle(M)$ follows very
closely the algorithm for trace slicing in Figure 2, the main difference being that trace slices
are processed, as generated, by $M$: instead of calculating the trace slice of $\theta'$ by appending
base event $e$ to the corresponding existing trace slice in line 4 of $\mathbb{A}\langle X\rangle$, we now calculate
and store in table $\Delta$ the state of the monitor instance corresponding to $\theta'$ by sending $e$
to the corresponding existing monitor instance (line 4 in $\mathbb{B}\langle X\rangle(M)$); at the same time we
also calculate the output corresponding to that monitor instance and store it in table $\Gamma$.
In other words, we replace trace slices in $\mathbb{A}\langle X\rangle$ by local monitors processing those
slices online. In our implementation in Section 9, we also check whether $\Gamma(\theta')$ at line 5 violates the
property and, if so, an error message including $\theta'$ is output to the user.
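
The following Python sketch mirrors the structure of Figure 3 on the iterator trace from the abstract. It is a minimal illustration under stated assumptions, not the JavaMOP or RV implementation: parameter instances are encoded as frozensets of (parameter, value) pairs, the helper names `inst`, `compatible`, `max_below`, and `monitor_b` are hypothetical, and the three-transition base monitor for "a collection is updated while an iterator over it is still used" is made up for the example.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Instance = FrozenSet[Tuple[str, str]]    # partial map as a set of (parameter, value)
BOTTOM: Instance = frozenset()


def inst(**bindings: str) -> Instance:
    return frozenset(bindings.items())


def compatible(a: Instance, b: Instance) -> bool:
    """Instances are compatible when they agree on every shared parameter."""
    db = dict(b)
    return all(db.get(p, v) == v for p, v in a)


def max_below(theta: Instance, known: Set[Instance]) -> Instance:
    """max (theta]_known; unique because known is lub closed and contains bottom."""
    return max((t for t in known if t <= theta), key=len)


def monitor_b(trace: List[Tuple[str, Instance]], iota, sigma, gamma):
    delta: Dict[Instance, str] = {BOTTOM: iota}     # line 1
    out: Dict[Instance, str] = {}
    known: Set[Instance] = {BOTTOM}
    for name, theta in trace:                       # line 2
        targets = {theta | t for t in known if compatible(theta, t)}
        # Read all source states before writing; by the non-interference
        # argument this equals the in-place loop at lines 3-6.
        new_states = {t: sigma(delta[max_below(t, known)], name) for t in targets}
        for t, s in new_states.items():
            delta[t] = s                            # line 4
            out[t] = gamma(s)                       # line 5
        known |= targets                            # line 7
    return delta, out, known


# Hypothetical base monitor; unlisted (state, event) pairs leave the state as is.
STEP = {
    ("start", "createIter"): "created",
    ("created", "updateColl"): "updated",
    ("updated", "next"): "violation",
}
sigma = lambda state, event: STEP.get((state, event), state)
gamma = lambda state: "violation" if state == "violation" else "?"

trace = [
    ("createIter", inst(c="c1", i="i1")),
    ("next", inst(i="i1")),
    ("createIter", inst(c="c1", i="i2")),
    ("updateColl", inst(c="c1")),
    ("next", inst(i="i1")),
]
delta, out, known = monitor_b(trace, "start", sigma, gamma)
```

In this run the instance binding `c` to `c1` and `i` to `i1` ends in the violating state, while the one for `i2` does not, matching the two slices shown in the abstract. Line 7 is implemented as a union with `targets`, using the identity $\{\bot, \theta\} \sqcup \Theta = \Theta \cup (\{\theta\} \sqcup \Theta)$ from the proof of 1. in Theorem 1.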
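Definition 21. Given $\tau \in \mathcal{E}\langle X\rangle^*$, let $\mathbb{B}\langle X\rangle(M)(\tau).\Theta$ and $\mathbb{B}\langle X\rangle(M)(\tau).\Delta$ and $\mathbb{B}\langle X\rangle(M)(\tau).\Gamma$
be the three data-structures maintained by the algorithm $\mathbb{B}\langle X\rangle(M)$ in Figure 3 after processing
$\tau$. Let $\bot \mapsto \iota = \mathbb{B}\langle X\rangle(M)(\epsilon).\Delta \in [[X \rightharpoonup V] \rightharpoonup S]$ be the partial map taking $\bot \in [X \rightharpoonup V]$
to $\iota$ and undefined elsewhere.

Corollary 4. The following hold for any $\tau \in \mathcal{E}\langle X\rangle^*$:

(1) $\mathrm{Dom}(\mathbb{B}\langle X\rangle(M)(\tau).\Delta) = \mathbb{B}\langle X\rangle(M)(\tau).\Theta = \Theta_\tau$;

(2) $\mathbb{B}\langle X\rangle(M)(\tau).\Delta(\theta) = \sigma(\iota, \tau{\upharpoonright}_{\theta})$ and
$\mathbb{B}\langle X\rangle(M)(\tau).\Gamma(\theta) = \gamma(\sigma(\iota, \tau{\upharpoonright}_{\theta}))$ for any $\theta \in \mathbb{B}\langle X\rangle(M)(\tau).\Theta$;

(3) $\sigma(\iota, \tau{\upharpoonright}_{\theta}) = \mathbb{B}\langle X\rangle(M)(\tau).\Delta(\max\,(\theta]_{\mathbb{B}\langle X\rangle(M)(\tau).\Theta})$ and
$\gamma(\sigma(\iota, \tau{\upharpoonright}_{\theta})) = \mathbb{B}\langle X\rangle(M)(\tau).\Gamma(\max\,(\theta]_{\mathbb{B}\langle X\rangle(M)(\tau).\Theta})$ for any $\theta \in [X \rightharpoonup V]$.

Proof. Follows from Theorem 1 and the discussion above. □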

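We next show how to associate a formal monitor to the algorithm $\mathbb{B}\langle X\rangle(M)$ in Figure 3.

Definition 22. For the algorithm $\mathbb{B}\langle X\rangle(M)$ in Figure 3, let

$$\mathcal{M}_{\mathbb{B}\langle X\rangle(M)} = (R,\ \mathcal{E}\langle X\rangle,\ [[X \rightharpoonup V] \rightarrow \mathcal{C}],\ \bot \mapsto \iota,\ \mathit{next},\ \mathit{out})$$

be the monitor defined as follows:

• $R \subseteq [[X \rightharpoonup V] \rightharpoonup S]$ is the set

$$\{\mathbb{B}\langle X\rangle(M)(\tau).\Delta \mid \tau \in \mathcal{E}\langle X\rangle^*\}$$

of reachable $\Delta$'s in $\mathbb{B}\langle X\rangle(M)$, and

• $\mathit{next} : R \times \mathcal{E}\langle X\rangle \rightarrow R$ and $\mathit{out} : R \rightarrow [[X \rightharpoonup V] \rightarrow \mathcal{C}]$ are functions defined as follows, where
$\tau \in \mathcal{E}\langle X\rangle^*$, $e \in \mathcal{E}$, and $\theta \in [X \rightharpoonup V]$:

$\mathit{next}(\mathbb{B}\langle X\rangle(M)(\tau).\Delta,\ e\langle\theta\rangle) = \mathbb{B}\langle X\rangle(M)(\tau\,e\langle\theta\rangle).\Delta$, and
$\mathit{out}(\mathbb{B}\langle X\rangle(M)(\tau).\Delta)(\theta) = \mathbb{B}\langle X\rangle(M)(\tau).\Gamma(\max\,(\theta]_{\mathbb{B}\langle X\rangle(M)(\tau).\Theta})$.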

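Theorem 2. $\mathcal{M}_{\mathbb{B}\langle X\rangle(M)} \equiv \Lambda X.M$ for any monitor $M$.

Proof. All we have to do is to show that, for any $\tau \in \mathcal{E}\langle X\rangle^*$, $\mathit{out}(\mathit{next}(\bot \mapsto \iota, \tau))$ and
$(\Lambda X.\gamma)((\Lambda X.\sigma)(\lambda\theta.\iota, \tau))$ are equal as total functions in $[[X \rightharpoonup V] \rightarrow \mathcal{C}]$. Let $\theta \in [X \rightharpoonup V]$;
then:

\begin{align*}
\mathit{out}(\mathit{next}(\bot \mapsto \iota, \tau))(\theta)
  &= \mathit{out}(\mathbb{B}\langle X\rangle(M)(\tau).\Delta)(\theta) \\
  &= \mathbb{B}\langle X\rangle(M)(\tau).\Gamma(\max\,(\theta]_{\mathbb{B}\langle X\rangle(M)(\tau).\Theta}) \\
  &= \gamma(\sigma(\iota, \tau{\upharpoonright}_{\theta})) \\
  &= \gamma((\Lambda X.\sigma)(\lambda\theta.\iota, \tau)(\theta)) \\
  &= (\Lambda X.\gamma)((\Lambda X.\sigma)(\lambda\theta.\iota, \tau))(\theta).
\end{align*}

The first equality above follows inductively by the definition of $\mathit{next}$ (Definition 22), noticing
that $\bot \mapsto \iota = \mathbb{B}\langle X\rangle(M)(\epsilon).\Delta$. The second equality follows by the definition of $\mathit{out}$ (Definition 22)
and the third by 3. in Corollary 4. The fourth equality above follows inductively by
the definition of $\Lambda X.\sigma$ (Definition 20) and has already been proved as part of the proof of
Proposition 19. Finally, the fifth equality follows by the definition of $\Lambda X.\gamma$ (Definition 20).
Therefore, $\mathcal{M}_{\mathbb{B}\langle X\rangle(M)}$ and $\Lambda X.M$ define the same property. □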
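Algorithm $\mathbb{C}\langle X\rangle(M = (S, \mathcal{E}, \mathcal{C}, \iota, \sigma, \gamma))$
Globals: mapping $\Delta : [[X \rightharpoonup V] \rightharpoonup S]$ and
         mapping $\mathcal{U} : [X \rightharpoonup V] \rightarrow \mathcal{P}_f([X \rightharpoonup V])$ and
         mapping $\Gamma : [[X \rightharpoonup V] \rightharpoonup \mathcal{C}]$
Initialization: $\mathcal{U}(\theta) \leftarrow \emptyset$ for any $\theta \in [X \rightharpoonup V]$, $\Delta(\bot) \leftarrow \iota$

function main($e\langle\theta\rangle$)
 1 if $\Delta(\theta)$ undefined then
 2     foreach $\theta_{max} \sqsubset \theta$ (in reversed topological order) do
 3         if $\Delta(\theta_{max})$ defined then
 4             goto 7
 5         endif
 6     endfor
 7     defineTo($\theta$, $\theta_{max}$)
 8     foreach $\theta_{max} \sqsubset \theta$ (in reversed topological order) do
 9         foreach $\theta_{comp} \in \mathcal{U}(\theta_{max})$ that is compatible with $\theta$ do
10             if $\Delta(\theta_{comp} \sqcup \theta)$ undefined then
11                 defineTo($\theta_{comp} \sqcup \theta$, $\theta_{comp}$)
12             endif
13         endfor
14     endfor
15 endif
16 foreach $\theta' \in \{\theta\} \cup \mathcal{U}(\theta)$ do
17     $\Delta(\theta') \leftarrow \sigma(\Delta(\theta'), e)$
18     $\Gamma(\theta') \leftarrow \gamma(\Delta(\theta'))$
19 endfor

function defineTo($\theta$, $\theta'$)
 1 $\Delta(\theta) \leftarrow \Delta(\theta')$
 2 foreach $\theta'' \sqsubset \theta$ do
 3     $\mathcal{U}(\theta'') \leftarrow \mathcal{U}(\theta'') \cup \{\theta\}$
 4 endfor

Figure 4: Online parametric monitoring algorithm $\mathbb{C}\langle X\rangle$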



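Corollary 5. If $M$ is a monitor for $P$ and $X$ is a set of parameters, then $\mathcal{M}_{\mathbb{B}\langle X\rangle(M)}$ is a
monitor for parametric property $\Lambda X.P$.

Proof. With the notation in Definition 17, Theorem 2 implies that $\mathcal{P}_{\mathcal{M}_{\mathbb{B}\langle X\rangle(M)}} = \mathcal{P}_{\Lambda X.M}$. By
Proposition 19 we have that $\mathcal{P}_{\Lambda X.M} = \Lambda X.\mathcal{P}_M$. Finally, since $P = \mathcal{P}_M$ by Proposition 15,
we conclude that $\mathcal{P}_{\mathcal{M}_{\mathbb{B}\langle X\rangle(M)}} = \Lambda X.P$. □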



8.2. Optimized Algorithm. Algorithm $\mathbb{C}\langle X\rangle$ in Figure 4 refines Algorithm $\mathbb{B}\langle X\rangle$ in Figure 3
for efficient online monitoring. Since no complete trace is given in online monitoring,
$\mathbb{C}\langle X\rangle$ focuses on actions to carry out when a parametric event $e\langle\theta\rangle$ arrives; in other words,
it essentially expands the body of the outer loop in $\mathbb{B}\langle X\rangle$ (lines 3 to 7 in Figure 3). The
direct use of $\mathbb{B}\langle X\rangle$ would yield prohibitive runtime overhead when monitoring large traces,
because its inner loop requires a search for all parameter instances in $\Theta$ that are compatible
with $\theta$; this search can be very expensive. $\mathbb{C}\langle X\rangle$ introduces an auxiliary data structure and
illustrates a mechanical way to accomplish the search, which also facilitates further optimizations.
While $\mathbb{B}\langle X\rangle$ did not require that $\theta$ in $e\langle\theta\rangle$ be of finite domain, $\mathbb{C}\langle X\rangle$ needs that
requirement in order to terminate. Note that in practice $\mathrm{Dom}(\theta)$ is always finite (because
the program state is finite).

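$\mathbb{C}\langle X\rangle$ uses three tables: $\Delta$, $\mathcal{U}$ and $\Gamma$. $\Delta$ and $\Gamma$ are the same as $\Delta$ and $\Gamma$ in $\mathbb{B}\langle X\rangle$,
respectively. $\mathcal{U}$ is an auxiliary data structure used to optimize the search "for all $\theta' \in
\{\theta\} \sqcup \Theta$" in $\mathbb{B}\langle X\rangle$ (line 3 in Figure 3). It maps each parameter instance $\theta$ into the finite
set of parameter instances encountered in $\Delta$ so far that are strictly more informative than
$\theta$, i.e., $\mathcal{U}(\theta) = \{\theta' \mid \theta' \in \mathrm{Dom}(\Delta) \text{ and } \theta \sqsubset \theta'\}$. Another major difference between $\mathbb{B}\langle X\rangle$
and $\mathbb{C}\langle X\rangle$ is that $\mathbb{C}\langle X\rangle$ does not maintain $\Theta$ during computation; instead, $\Theta$ is implicitly
captured by the domain of $\Delta$ in $\mathbb{C}\langle X\rangle$. Intuitively, the $\Theta$ at the beginning/end of the
body of the outer loop in $\mathbb{B}\langle X\rangle$ is the $\mathrm{Dom}(\Delta)$ at the beginning/end of $\mathbb{C}\langle X\rangle$, respectively.
However, $\Theta$ is fixed during the loop at lines 3 to 6 in $\mathbb{B}\langle X\rangle$ and updated atomically in line 7,
while $\mathrm{Dom}(\Delta)$ can be changed at any time during the execution of $\mathbb{C}\langle X\rangle$.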

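$\mathbb{C}\langle X\rangle$ is composed of two functions, main and defineTo. The defineTo function takes two
parameter instances, $\theta$ and $\theta'$, and adds a new entry corresponding to $\theta$ into $\Delta$ and $\mathcal{U}$.
Specifically, it sets $\Delta(\theta)$ to $\Delta(\theta')$ and adds $\theta$ into the set $\mathcal{U}(\theta'')$ for each $\theta'' \sqsubset \theta$.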

The main function differentiates two cases when a new event $e\langle\theta\rangle$ is received and processed.
The simpler case is that $\Delta$ is already defined on $\theta$, i.e., $\theta \in \Theta$ at the beginning of the
iteration of the outer loop in $\mathbb{B}\langle X\rangle$. In this case, $\{\theta\} \sqcup \Theta = \{\theta' \mid \theta' \in \Theta \text{ and } \theta \sqsubseteq \theta'\} \subseteq \Theta$,
so the lines 3 to 6 in $\mathbb{B}\langle X\rangle$ become precisely the lines 16 to 19 in $\mathbb{C}\langle X\rangle$. In the other case,
when $\Delta$ is not already defined on $\theta$, main takes two steps to handle $e$. The first step searches
for new parameter instances introduced by $\{\theta\} \sqcup \Theta$ and adds entries for them into $\Delta$ (lines 2
to 14). We first add an entry to $\Delta$ for $\theta$ at lines 2 to 7. Then we search for all parameter
instances $\theta_{comp}$ that are compatible with $\theta$, making use of $\mathcal{U}$ (lines 8 and 9); for each such
$\theta_{comp}$, an appropriate entry is added to $\Delta$ for its lub with $\theta$, and $\mathcal{U}$ is updated accordingly
(lines 10 to 12). This way, $\Delta$ will be defined on all the new parameter instances introduced
by $\{\theta\} \sqcup \Theta$ after the first step. In the second step, the related monitor states and outputs
are updated in a similar way as in the first case (lines 16 to 19). It is interesting to note
how $\mathbb{C}\langle X\rangle$ searches at lines 2 and 8 for the parameter instance $\max\,(\theta]_{\Theta}$ that $\mathbb{B}\langle X\rangle$ refers
to at line 4 in Figure 3: it enumerates all the $\theta_{max} \sqsubset \theta$ in reversed topological order (from
larger to smaller); 1. in Proposition 8 guarantees that the maximum exists and, since it is
unique, our search will find it.
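
The two functions of Figure 4 can be sketched in Python as follows. As before, this is an illustration under assumptions rather than the actual implementation: instances are frozensets of (parameter, value) pairs, the class and helper names (`OnlineMonitor`, `strict_subinstances`, `compatible`, `inst`) are hypothetical, and the base monitor is the same made-up one used to illustrate $\mathbb{B}\langle X\rangle$ above. "Reversed topological order" is realized by enumerating proper sub-instances from larger to smaller.

```python
from collections import defaultdict
from itertools import combinations
from typing import FrozenSet, Iterator, Tuple

Instance = FrozenSet[Tuple[str, str]]
BOTTOM: Instance = frozenset()


def inst(**bindings: str) -> Instance:
    return frozenset(bindings.items())


def compatible(a: Instance, b: Instance) -> bool:
    db = dict(b)
    return all(db.get(p, v) == v for p, v in a)


def strict_subinstances(theta: Instance) -> Iterator[Instance]:
    """Proper sub-instances of theta, larger ones first."""
    items = list(theta)
    for size in range(len(items) - 1, -1, -1):
        for combo in combinations(items, size):
            yield frozenset(combo)


class OnlineMonitor:
    def __init__(self, iota, sigma, gamma):
        self.sigma, self.gamma = sigma, gamma
        self.delta = {BOTTOM: iota}       # table Delta
        self.more = defaultdict(set)      # table U: strictly more informative instances
        self.out = {}                     # table Gamma

    def define_to(self, theta: Instance, source: Instance) -> None:
        self.delta[theta] = self.delta[source]
        for sub in strict_subinstances(theta):
            self.more[sub].add(theta)

    def main(self, name: str, theta: Instance) -> None:
        if theta not in self.delta:                               # line 1
            # Lines 2-7: the first defined sub-instance is max (theta]_Dom(Delta).
            theta_max = next(s for s in strict_subinstances(theta) if s in self.delta)
            self.define_to(theta, theta_max)
            for theta_max in strict_subinstances(theta):          # line 8
                for comp in list(self.more[theta_max]):           # line 9 (snapshot)
                    if compatible(comp, theta) and (comp | theta) not in self.delta:
                        self.define_to(comp | theta, comp)        # line 11
        for t in {theta} | self.more[theta]:                      # lines 16-19
            self.delta[t] = self.sigma(self.delta[t], name)
            self.out[t] = self.gamma(self.delta[t])


# Hypothetical base monitor; unlisted (state, event) pairs leave the state as is.
STEP = {
    ("start", "createIter"): "created",
    ("created", "updateColl"): "updated",
    ("updated", "next"): "violation",
}
sigma = lambda state, event: STEP.get((state, event), state)
gamma = lambda state: "violation" if state == "violation" else "?"

monitor = OnlineMonitor("start", sigma, gamma)
for name, theta in [
    ("createIter", inst(c="c1", i="i1")),
    ("next", inst(i="i1")),
    ("createIter", inst(c="c1", i="i2")),
    ("updateColl", inst(c="c1")),
    ("next", inst(i="i1")),
]:
    monitor.main(name, theta)
```

On this trace the sketch reports the same verdicts as the $\mathbb{B}\langle X\rangle$ sketch: a violation for the instance with `i1` and none for the one with `i2`. Iterating over a snapshot of `self.more[theta_max]` is a deliberate choice: entries added by `define_to` during the loop are all more informative than `theta`, so their lub with `theta` is already defined and skipping them loses nothing.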

Correctness of $\mathbb{C}\langle X\rangle$. We prove the correctness of $\mathbb{C}\langle X\rangle$ by showing that it is equivalent
to the body of the outer loop in $\mathbb{B}\langle X\rangle$. Suppose that parametric trace $\tau$ has already been
processed by both $\mathbb{C}\langle X\rangle$ and $\mathbb{B}\langle X\rangle$, and a new event $e\langle\theta\rangle$ is to be processed next.

Let us first note that $\mathbb{C}\langle X\rangle$ terminates if $\mathrm{Dom}(\theta)$ is finite. Indeed, if $\mathrm{Dom}(\theta)$ is finite
then there is only a finite number of partial maps less informative than $\theta$, that is, only a
finite number of iterations for the loops at lines 2 and 8 in main; since $\mathcal{U}$ is only updated at
line 3 in defineTo, $\mathcal{U}(\theta)$ is finite for any $\theta \in [X \rightharpoonup V]$ and thus the loop at line 9 in main also
terminates. Assuming that running the base monitor $M$ takes constant time, the worst case
complexity of $\mathbb{C}\langle X\rangle(M)$ is $O(k \times l)$ to process $e\langle\theta\rangle$, where $k$ is $2^{|\mathrm{Dom}(\theta)|}$ and $l$ is the number
of incompatible parameter instances in $\tau$. Parametric properties often have a fixed and
small number of parameters, in which case $k$ is not significant. Depending on the trace, $l$
can unavoidably grow arbitrarily large; in the worst case, each event may carry an instance
incompatible with the previous ones.

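Lemma 1. In the algorithm $\mathbb{C}\langle X\rangle$ in Figure 4, $\mathcal{U}(\theta) = \{\theta' \mid \theta' \in \mathrm{Dom}(\Delta) \text{ and } \theta \sqsubset \theta'\}$ before
and after each execution of defineTo, for all $\theta \in [X \rightharpoonup V]$.

Proof. By how $\mathbb{C}\langle X\rangle$ is initialized, for any $\theta \in [X \rightharpoonup V]$ we have $\emptyset = \mathcal{U}(\theta) = \{\theta' \mid \theta' \in
\mathrm{Dom}(\Delta) \text{ and } \theta \sqsubset \theta'\}$ before the first execution of defineTo. Now suppose that $\mathcal{U}(\theta) = \{\theta' \mid
\theta' \in \mathrm{Dom}(\Delta) \text{ and } \theta \sqsubset \theta'\}$ for any $\theta \in [X \rightharpoonup V]$ before an execution of defineTo and let us show
that it also holds after the execution of defineTo. Since defineTo($\theta$, $\theta'$) adds a new parameter
instance $\theta$ into $\mathrm{Dom}(\Delta)$ and also adds $\theta$ into the set $\mathcal{U}(\theta'')$ for any $\theta'' \in [X \rightharpoonup V]$ with $\theta'' \sqsubset \theta$,
we still have $\mathcal{U}(\theta) = \{\theta' \mid \theta' \in \mathrm{Dom}(\Delta) \text{ and } \theta \sqsubset \theta'\}$ for any $\theta \in [X \rightharpoonup V]$ after the execution
of defineTo. Also, the only way $\mathbb{C}\langle X\rangle$ can add a new parameter instance $\theta$ into $\mathrm{Dom}(\Delta)$ is
by using defineTo. Therefore the lemma holds. □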

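The next theorem proves the correctness of $\mathbb{C}\langle X\rangle$. Before we state and prove it, let us
recall some previously introduced notation and also introduce some new useful notation.
First, recall from Definition 21 that $\mathbb{B}\langle X\rangle(M)(\tau).\Delta$ and $\mathbb{B}\langle X\rangle(M)(\tau).\Gamma$ are the $\Delta$ and $\Gamma$
data-structures of $\mathbb{B}\langle X\rangle(M)$ after it processes trace $\tau$. Also, recall that we fixed parametric
trace $\tau$ and event $e\langle\theta\rangle$. For clarity, let $\mathcal{U}_{\mathbb{C}}$, $\Delta_{\mathbb{C}}$, and $\Gamma_{\mathbb{C}}$ be the three data-structures maintained
by $\mathbb{C}\langle X\rangle(M)$ (in other words, we index the data-structures with the symbol $\mathbb{C}$). Let
$\Delta_{\mathbb{C}}^{b}$ and $\Gamma_{\mathbb{C}}^{b}$ be the $\Delta_{\mathbb{C}}$ and $\Gamma_{\mathbb{C}}$ when main($e\langle\theta\rangle$) begins ("b" stands for "at the beginning"); let
$\Delta_{\mathbb{C}}^{e}$ and $\Gamma_{\mathbb{C}}^{e}$ be the $\Delta_{\mathbb{C}}$ and $\Gamma_{\mathbb{C}}$ when main($e\langle\theta\rangle$) ends ("e" stands for "at the end"); and let $\Delta_{\mathbb{C}}^{m}$
and $\mathcal{U}_{\mathbb{C}}^{m}$ be the $\Delta_{\mathbb{C}}$ and $\mathcal{U}_{\mathbb{C}}$ when main($e\langle\theta\rangle$) reaches line 16 ("m" stands for "in the middle").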

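Theorem 3. The following hold:

(1) $\mathrm{Dom}(\Delta_{\mathbb{C}}^{m}) = \{\bot, \theta\} \sqcup \mathrm{Dom}(\Delta_{\mathbb{C}}^{b})$;

(2) $\Delta_{\mathbb{C}}^{m}(\theta') = \Delta_{\mathbb{C}}^{b}(\max\,(\theta']_{\mathrm{Dom}(\Delta_{\mathbb{C}}^{b})})$, for all $\theta' \in \mathrm{Dom}(\Delta_{\mathbb{C}}^{m})$;

(3) If $\Delta_{\mathbb{C}}^{b} = \mathbb{B}\langle X\rangle(M)(\tau).\Delta$ and $\Gamma_{\mathbb{C}}^{b} = \mathbb{B}\langle X\rangle(M)(\tau).\Gamma$, then $\Delta_{\mathbb{C}}^{e} = \mathbb{B}\langle X\rangle(M)(\tau\,e\langle\theta\rangle).\Delta$ and
$\Gamma_{\mathbb{C}}^{e} = \mathbb{B}\langle X\rangle(M)(\tau\,e\langle\theta\rangle).\Gamma$.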

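Proof. For simplicity, let $\Theta_{\mathbb{C}} = \mathrm{Dom}(\Delta_{\mathbb{C}}^{b}) = \mathrm{Dom}(\Delta_{\mathbb{B}(\tau)})$, where $\Delta_{\mathbb{B}(\tau)}$ stands for $\mathbb{B}\langle X\rangle(M)(\tau).\Delta$
and, similarly, $\Delta_{\mathbb{B}(\tau e)}$ stands for $\mathbb{B}\langle X\rangle(M)(\tau\,e\langle\theta\rangle).\Delta$.

1. There are two cases to analyze, depending upon whether $\theta$ is in $\Theta_{\mathbb{C}}$ or not. If $\theta \in \Theta_{\mathbb{C}}$
then the lines 2 to 14 are skipped and $\mathrm{Dom}(\Delta_{\mathbb{C}})$ remains unchanged, that is, $\{\bot, \theta\} \sqcup \Theta_{\mathbb{C}} =
\Theta_{\mathbb{C}} = \mathrm{Dom}(\Delta_{\mathbb{C}}^{b}) = \mathrm{Dom}(\Delta_{\mathbb{C}}^{m})$ when main($e\langle\theta\rangle$) reaches line 16. If $\theta \notin \Theta_{\mathbb{C}}$ then lines 2 to 14
are executed to add new parameter instances into $\mathrm{Dom}(\Delta_{\mathbb{C}})$. First, an entry for $\theta$ will be
added to $\Delta_{\mathbb{C}}$ at line 7. Second, an entry for $\theta_{comp} \sqcup \theta$ will eventually be added to $\Delta_{\mathbb{C}}$ at line 11 (if $\Delta_{\mathbb{C}}$
is not already defined on $\theta_{comp} \sqcup \theta$) for any $\theta_{comp} \in \Theta_{\mathbb{C}}$ compatible with $\theta$; that is
because $\theta_{max}$ can also be $\bot$ at line 8, in which case Lemma 1 implies that $\mathcal{U}(\theta_{max}) = \Theta_{\mathbb{C}}$.
Therefore, when line 16 is reached, $\Delta_{\mathbb{C}}^{m}$ is defined on all the parameter instances in
$\{\theta\} \cup (\{\theta\} \sqcup \Theta_{\mathbb{C}})$. Since $\bot \in \Theta_{\mathbb{C}}$, the latter equals $\{\theta\} \sqcup \Theta_{\mathbb{C}}$, and since $\Delta_{\mathbb{C}}$ remains defined
on $\Theta_{\mathbb{C}}$, we conclude that $\Delta_{\mathbb{C}}^{m}$ is defined on all instances in $(\{\theta\} \sqcup \Theta_{\mathbb{C}}) \cup \Theta_{\mathbb{C}}$, which by 5.
and 7. in Proposition 4 equals $\{\bot, \theta\} \sqcup \Theta_{\mathbb{C}}$.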

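2. We analyze the same two cases as above. If $\theta \in \Theta_{\mathbb{C}}$ then lines 2 to 14 are skipped and
$\mathrm{Dom}(\Delta_{\mathbb{C}})$ remains unchanged. Then $\max\,(\theta']_{\Theta_{\mathbb{C}}} = \theta'$ for each $\theta' \in \mathrm{Dom}(\Delta_{\mathbb{C}}^{m})$, so the result
follows. Suppose now that $\theta \notin \Theta_{\mathbb{C}}$. By 1. and its proof, each $\theta' \in \mathrm{Dom}(\Delta_{\mathbb{C}}^{m})$ is either
in $\Theta_{\mathbb{C}}$ or otherwise in $(\{\theta\} \sqcup \Theta_{\mathbb{C}}) - \Theta_{\mathbb{C}}$. The result immediately holds when $\theta' \in \Theta_{\mathbb{C}}$ as
$\max\,(\theta']_{\Theta_{\mathbb{C}}} = \theta'$ and $\Delta(\theta')$ stays unchanged until line 16. If $\theta' \in (\{\theta\} \sqcup \Theta_{\mathbb{C}}) - \Theta_{\mathbb{C}}$ then $\Delta(\theta')$
is set at either line 7 ($\theta' = \theta$) or at line 11 ($\theta' \neq \theta$):

(a) For line 7, the loop at lines 2 to 6 checks all the parameter instances that are
less informative than $\theta$ to find the first one in $\Theta_{\mathbb{C}}$ in reversed topological order (i.e., if
$\theta_1 \sqsubset \theta_2$ then $\theta_2$ will be checked before $\theta_1$). Since by 1. in Proposition 8 we know that
$\max\,(\theta']_{\Theta_{\mathbb{C}}} \in \Theta_{\mathbb{C}}$ exists (and it is unique), the loop at lines 2 to 6 will break precisely when
$\theta_{max} = \max\,(\theta]_{\Theta_{\mathbb{C}}}$, so the result holds when $\theta' = \theta$ because of the entry introduced for $\theta$ in
$\Delta_{\mathbb{C}}$ at line 7 and because the remaining lines 8 to 14 do not change $\Delta_{\mathbb{C}}(\theta)$.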

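(b) When $\Delta_{\mathbb{C}}(\theta')$ is set at line 11, note that the loop at lines 8 to 14 also iterates over
all $\theta_{max} \sqsubset \theta$ in reversed topological order, so $\theta' = \theta_{comp} \sqcup \theta$ for some $\theta_{comp} \in \Theta_{\mathbb{C}}$ compatible
with $\theta$ such that $\theta_{max} \sqsubset \theta_{comp}$, where $\theta_{max} \sqsubset \theta$ is such that there is no other $\theta'_{max}$ with
$\theta_{max} \sqsubset \theta'_{max} \sqsubset \theta$ and $\theta' = \theta'_{comp} \sqcup \theta$ for some $\theta'_{comp} \in \Theta_{\mathbb{C}}$ compatible with $\theta$ such that
$\theta'_{max} \sqsubset \theta'_{comp}$. We claim that there is only one such $\theta_{comp}$, which is precisely $\max\,(\theta']_{\Theta_{\mathbb{C}}}$:
let $\theta'_{comp}$ be the parameter instance $\max\,(\theta']_{\Theta_{\mathbb{C}}}$. The above implies that $\theta_{comp} \sqsubseteq \theta'_{comp} \sqsubseteq \theta'$.
Also, $\theta'_{comp} \sqcup \theta = \theta'$ because $\theta' = \theta_{comp} \sqcup \theta \sqsubseteq \theta'_{comp} \sqcup \theta \sqsubseteq \theta'$. Let $\theta'_{max}$ be $\theta'_{comp} \sqcap \theta$, that is,
the largest with $\theta'_{max} \sqsubseteq \theta'_{comp}$ and $\theta'_{max} \sqsubseteq \theta$ (we leave its existence as an exercise). It is relatively
easy to see now that $\theta_{comp} \sqsubset \theta'_{comp}$ implies $\theta_{max} \sqsubset \theta'_{max}$ (we leave it as an exercise, too),
which contradicts the assumption of this case that $\Delta_{\mathbb{C}}$ was not defined on $\theta'$. Therefore,
$\theta_{comp} = \max\,(\theta']_{\Theta_{\mathbb{C}}}$ before line 11 is executed, which means that, after line 11 is executed,
$\Delta_{\mathbb{C}}(\theta') = \Delta_{\mathbb{C}}(\max\,(\theta']_{\Theta_{\mathbb{C}}})$; moreover, none of these will be changed anymore until line 16 is
reached, which proves our result.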

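3. Since $\Gamma$ is updated according to $\Delta$ in both $\mathbb{C}\langle X\rangle$ and $\mathbb{B}\langle X\rangle$, it is enough to prove that
$\Delta_{\mathbb{C}}^{e} = \Delta_{\mathbb{B}(\tau e)}$. For $\mathbb{B}\langle X\rangle$, we have

1) $\mathrm{Dom}(\Delta_{\mathbb{B}(\tau e)}) = \{\bot, \theta\} \sqcup \Theta_{\mathbb{C}} = (\{\theta\} \sqcup \Theta_{\mathbb{C}}) \cup \Theta_{\mathbb{C}}$;

2) $\forall\, \theta' \in \{\theta\} \sqcup \Theta_{\mathbb{C}}$, $\Delta_{\mathbb{B}(\tau e)}(\theta') = \sigma(\Delta_{\mathbb{B}(\tau)}(\max\,(\theta']_{\Theta_{\mathbb{C}}}), e)$;

3) $\forall\, \theta' \in \Theta_{\mathbb{C}} - \{\theta\} \sqcup \Theta_{\mathbb{C}}$, $\Delta_{\mathbb{B}(\tau e)}(\theta') = \Delta_{\mathbb{B}(\tau)}(\theta')$.

So we only need to prove that

1) $\mathrm{Dom}(\Delta_{\mathbb{C}}^{e}) = \{\bot, \theta\} \sqcup \Theta_{\mathbb{C}}$;

2) $\forall\, \theta' \in \{\theta\} \sqcup \Theta_{\mathbb{C}}$, $\Delta_{\mathbb{C}}^{e}(\theta') = \sigma(\Delta_{\mathbb{C}}^{b}(\max\,(\theta']_{\Theta_{\mathbb{C}}}), e)$;

3) $\forall\, \theta' \in \Theta_{\mathbb{C}} - \{\theta\} \sqcup \Theta_{\mathbb{C}}$, $\Delta_{\mathbb{C}}^{e}(\theta') = \Delta_{\mathbb{C}}^{b}(\theta')$.

By 1., we have $\mathrm{Dom}(\Delta_{\mathbb{C}}^{m}) = \{\bot, \theta\} \sqcup \Theta_{\mathbb{C}}$. Since lines 16 to 19 do not change $\mathrm{Dom}(\Delta_{\mathbb{C}})$,
$\mathrm{Dom}(\Delta_{\mathbb{C}}^{e}) = \mathrm{Dom}(\Delta_{\mathbb{C}}^{m}) = \{\bot, \theta\} \sqcup \Theta_{\mathbb{C}}$, so 1) holds.

By 2. and Lemma 1, $\Delta_{\mathbb{C}}^{m}(\theta') = \Delta_{\mathbb{C}}^{b}(\max\,(\theta']_{\Theta_{\mathbb{C}}})$ for any $\theta' \in \mathrm{Dom}(\Delta_{\mathbb{C}}^{m})$. Also, notice
that line 17 sets $\Delta_{\mathbb{C}}(\theta')$ to $\sigma(\Delta_{\mathbb{C}}(\theta'), e)$, which is $\sigma(\Delta_{\mathbb{C}}^{b}(\max\,(\theta']_{\Theta_{\mathbb{C}}}), e)$, for the $\theta'$ in the loop.
So, to show 2) and 3), we only need to prove that the loop at lines 16 to 19 iterates over
$\{\theta\} \sqcup \Theta_{\mathbb{C}}$. Since lines 16 to 19 do not change $\mathcal{U}_{\mathbb{C}}$, we need to show $\{\theta\} \cup \mathcal{U}_{\mathbb{C}}^{m}(\theta) = \{\theta\} \sqcup \Theta_{\mathbb{C}}$.
Since $\mathrm{Dom}(\Delta_{\mathbb{C}}^{m}) = \{\bot, \theta\} \sqcup \Theta_{\mathbb{C}}$, we have $\{\theta\} \sqcup \mathrm{Dom}(\Delta_{\mathbb{C}}^{m}) = \{\theta\} \sqcup (\{\bot, \theta\} \sqcup \Theta_{\mathbb{C}}) = \{\theta\} \sqcup
((\{\theta\} \sqcup \Theta_{\mathbb{C}}) \cup \Theta_{\mathbb{C}})$. By Proposition 4, $\{\theta\} \sqcup \mathrm{Dom}(\Delta_{\mathbb{C}}^{m}) = (\{\theta\} \sqcup (\{\theta\} \sqcup \Theta_{\mathbb{C}})) \cup (\{\theta\} \sqcup \Theta_{\mathbb{C}}) =
(\{\theta\} \sqcup \Theta_{\mathbb{C}}) \cup (\{\theta\} \sqcup \Theta_{\mathbb{C}}) = \{\theta\} \sqcup \Theta_{\mathbb{C}}$. Also, as $\theta \in \mathrm{Dom}(\Delta_{\mathbb{C}}^{m})$, we have $\{\theta\} \sqcup \mathrm{Dom}(\Delta_{\mathbb{C}}^{m}) = \{\theta' \mid
\theta' \in \mathrm{Dom}(\Delta_{\mathbb{C}}^{m}) \text{ and } \theta \sqsubseteq \theta'\} = \{\theta\} \cup \mathcal{U}_{\mathbb{C}}^{m}(\theta)$ by Lemma 1. So $\{\theta\} \cup \mathcal{U}_{\mathbb{C}}^{m}(\theta) = \{\theta\} \sqcup \Theta_{\mathbb{C}}$. □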

We conclude this section with a discussion on the complexity of the parametric monitoring
algorithms $\mathbb{B}\langle X\rangle$ and $\mathbb{C}\langle X\rangle$ above. Note that, in the worst case, to process a newly
received parametric event $e\langle\theta\rangle$ after $\mathbb{B}\langle X\rangle$ or $\mathbb{C}\langle X\rangle$ has already processed a parametric trace
$\tau$, each of $\mathbb{B}\langle X\rangle$ or $\mathbb{C}\langle X\rangle$ takes at least linear time/space in the number of $\theta$-compatible
parameter instances occurring in events in $\tau$. Indeed, $\mathbb{B}\langle X\rangle$ iterates explicitly through all
such parameter instances (line 3 in Figure 3), while $\mathbb{C}\langle X\rangle$ optimizes this traversal by only
enumerating through maximal parameter instances; in the worst case, we can assume that
$\tau$ is such that each event comes with a new parameter instance which is maximal, so in the
worst case $\mathbb{B}\langle X\rangle$ and $\mathbb{C}\langle X\rangle$ can take linear time/space in the length of $\tau$ to process $e\langle\theta\rangle$, which is
problematic. Indeed, it means that monitoring some traces becomes incrementally slower (with
no upper bound) as events are received, until the monitor eventually runs out of resources.

Unfortunately, nothing fundamental can be done to avoid the problem above.
It is an inherent problem of parametric monitoring. Consider, for example, the "authenticate
before use" parametric property specified in Section 2.2 using parametric LTL as
$\Lambda k.\ \Box(\mathit{use}\langle k\rangle \rightarrow ⟐\,\mathit{authenticate}\langle k\rangle)$; to make it clear that events depend on the parameter key
$k$, we tagged them with the key. Without any knowledge about the semantics of the program
to monitor, any monitor for this property must store all the authenticated keys, i.e.,
all the instances of the parameter $k$. Indeed, without that, there is no way to know whether
a key instance has been authenticated or not when a use event is observed on that key. The
number of such key instances is theoretically unbounded, so, in the worst case, any monitor
for this property can become incrementally slower and eventually run out of resources.
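
The memory argument can be made concrete with a hand-written monitor for this one property. The sketch below is hypothetical (the class `AuthenticateBeforeUse` and its `observe` method are not part of JavaMOP or RV); it only illustrates that the set of authenticated keys has to be kept and grows with the number of distinct keys observed.

```python
from typing import Hashable, Set


class AuthenticateBeforeUse:
    """Opaque monitor sketch: it must remember every authenticated key."""

    def __init__(self) -> None:
        self.authenticated: Set[Hashable] = set()

    def observe(self, event: str, key: Hashable) -> str:
        if event == "authenticate":
            self.authenticated.add(key)       # one entry per distinct key, never dropped
            return "?"
        if event == "use" and key not in self.authenticated:
            return "violation"
        return "?"


monitor = AuthenticateBeforeUse()
verdicts = [
    monitor.observe("authenticate", "k1"),
    monitor.observe("use", "k1"),
    monitor.observe("use", "k2"),
]
```

Dropping any entry from `authenticated` would make a later `use` on that key indistinguishable from a use that was never authenticated, which is exactly the point made above.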

As seen in Section 9, the runtime overhead due to opaque monitoring of parametric
properties tends to be manageable in practice. By "opaque" we mean that no semantic
information about the source code of the monitored program is used. If the lack of an
efficiency guarantee is a problem in some applications, then the alternative is to statically
analyze the monitored program and to use the obtained semantic information to eliminate
the need for monitoring. For example, static analyses like those in [25], [10], [12], [18] may
significantly reduce the need for instrumentation, or even eliminate it completely. Moreover,
model-checking techniques for parametric properties could also be used to actually prove
that the properties hold and thus need not be monitored; however, we are not aware
of model checking approaches to verifying parametric properties as presented in this paper.

9. Implementation in JavaMOP and RV 

The discussed parametric monitoring technique is now fully implemented in two runtime
verification systems, namely in JavaMOP (see http://javamop.org) and in RV [27] (developed
by a startup company, Runtime Verification, Inc.; the RV system is currently not publicly
available; contact the first author for an NDA-protected version of RV). Here we first
informally discuss several optimizations implemented in the two runtime verification systems,
and then we discuss our experiments and the evaluation of the two systems.

9.1. Implementation Optimizations. Both JavaMOP and RV apply several optimizations
to the algorithm $\mathbb{C}\langle X\rangle$ in Section 8.2 to reduce its runtime overhead. These are not
discussed in depth here, because they are orthogonal to the main objective of this paper.

9.1.1. Optimizations in JavaMOP. Note that C{X) iterates through all the possible pa- 
rameter instances that are less informative than 6 in three different loops: at lines 2 and 8 
in the main function, and at line 2 in the defineTo function. Hence, it is important to reduce 
the number of such instances in each loop. Even though our semantics and theoretical 
algorithms for parametric monitoring in this paper work with infinite sets of parameters, 
our current implementation in JavaMOP assumes that the set of parameters X is bounded 
and fixed a priori (declared as part of the specification to monitor). A simple analysis of the
events appearing in the specification makes it possible to quickly detect parameter instances that
can never appear as lubs of instances of parameters carried by events; maintaining any space
for those in Δ or Γ, or iterating over them in the above mentioned loops, is a waste. For
example, if a specification contains only two event definitions, e1(p1) and e2(p1, p2),
parameter instances defining only parameter p2 can never appear as lubs of observed parameter
instances. A static analysis of the specification, discussed in [11, 10], exhaustively explores
all possible event combinations that can lead to situations of interest to the property, such 
as to violation, validation, etc. Such information is useful to reduce the number of loop 
iterations by skipping iterations over parameter instances that cannot affect the result of 
monitoring. These static analyses are currently used at compile time in our new JavaMOP 
implementation to unroll the loops in C⟨X⟩ and reduce the size of Δ and 𝒰.
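
The simple domain analysis mentioned above can be illustrated with a small sketch. It is not the analysis implemented in JavaMOP; it only assumes that a specification is given as a map from event names to the parameters they carry (the names `spec`, `reachable_domains` and `all_nonempty_domains` are hypothetical), and it over-approximates the domains of the parameter instances that can arise as lubs by closing the event parameter sets under union.

```python
from itertools import combinations


def reachable_domains(event_params):
    """Close the parameter sets of the events under union.

    An instance carried by an event has exactly that event's parameters as
    its domain, and the lub of compatible instances has the union of their
    domains, so no lub can have a domain outside this closure.
    """
    domains = {frozenset(ps) for ps in event_params.values()}
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(domains), 2):
            union = a | b
            if union not in domains:
                domains.add(union)
                changed = True
    return domains


def all_nonempty_domains(params):
    """Every non-empty subset of the declared parameter set."""
    params = sorted(params)
    return {frozenset(c)
            for r in range(1, len(params) + 1)
            for c in combinations(params, r)}


# The two-event specification from the example above (assumed encoding).
spec = {"e1": ["p1"], "e2": ["p1", "p2"]}
reachable = reachable_domains(spec)
unreachable = all_nonempty_domains({"p1", "p2"}) - reachable

print(sorted(sorted(d) for d in reachable))    # domains worth keeping space for
print(sorted(sorted(d) for d in unreachable))  # domains that can be skipped
```

In this sketch the only skipped domain is the one defining p2 alone, matching the example in the text; the analyses in the cited work additionally take the property itself into account.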

Another optimization is based on the observation that it is convenient to start the 
monitoring process only when certain events are received. Such events are called monitor 
creation events in [14] . The parameter instances carried by such creation events may also 
be used to reduce the number of parameter instances that need to be considered. An ex- 
treme, yet surprisingly common case is when creation events instantiate all the property 
parameters. In this case, the monitoring process does not need to search for compatible 
parameter instances even when an event with an incomplete parameter instance is observed. 
The old JavaMOP [14] supported only traces whose monitoring started with a fully instantiated
monitor creation event; this was perceived (and admitted) as a performance-tradeoff
limitation of JavaMOP (and pLl]). Interestingly, it now becomes just a common-case 
optimization of our novel, general and unrestricted technique presented here. 

9.1.2. Optimizations in RV. The RV system implements all the optimizations in JavaMOP 
and adds two other important optimizations that significantly reduce the overhead. 

The first additional optimization of RV is a non-trivial garbage collector [20]. Note
that JavaMOP also has a garbage collector, but it only does the obvious: it garbage collects 
a monitor instance only when all its corresponding parameter instances are collected. RV 
performs a static analysis of the property to monitor and, based on that, it garbage collects 
a monitor instance as soon as it realizes that it can never trigger in the future. This can 
happen when any triggering behavior needs at least one event that can only be generated 
in the presence of a parameter instance that is already dead. Consider, for example, the 
safe iterator example in Section 2.3 and consider that iterator i7 is created for collection
c3. Then a monitor instance corresponding to the parameter instance (c3, i7) is created and
managed. Suppose that, at some moment, the iterator is garbage collected by the JVM.
Can the monitor instance corresponding to (c3, i7) be garbage collected? Not in JavaMOP,
because, for safety, JavaMOP collects a monitor only when all its parameter instances are
collected, and in this case c3 is still alive. However, this monitor is flagged for garbage
collection in RV. The rationale for doing so is that the only way for the monitor to trigger
is to eventually encounter a next event with i7 as parameter, but that event can never be
generated because i7 is dead. Note, on the other hand, that the monitor (c3, i7) cannot be
garbage collected if c3 is collected but i7 is still alive, because the iterator alone can still
violate the safe-iterator property, even if its corresponding collection is already dead. 
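
The difference between the two collection policies can be made concrete with a minimal sketch. It is not RV's implementation: it assumes that the static analysis of the property has already produced, for a monitor, a list of parameter sets (here called `trigger_needs`, a hypothetical name), each listing the parameters that must still be alive for one remaining way of triggering. For the safe iterator property that list is assumed to contain a single set with the iterator parameter, since any violation requires a future next event.

```python
def collectable_when_all_dead(bound_params, dead):
    """Conservative policy: collect only when every bound parameter is dead."""
    return set(bound_params) <= set(dead)


def collectable_when_cannot_trigger(trigger_needs, dead):
    """Eager policy (sketch): collect as soon as every remaining way of
    triggering needs at least one parameter that is already dead."""
    dead = set(dead)
    return all(set(needs) & dead for needs in trigger_needs)


# Assumed encoding of the safe iterator example: the monitor binds c and i,
# and any violation needs a future `next` event on the iterator i.
SAFE_ITER_PARAMS = {"c", "i"}
SAFE_ITER_TRIGGER_NEEDS = [{"i"}]

for dead in ({"i"}, {"c"}, {"c", "i"}):
    print(sorted(dead),
          collectable_when_all_dead(SAFE_ITER_PARAMS, dead),
          collectable_when_cannot_trigger(SAFE_ITER_TRIGGER_NEEDS, dead))
```

With only the iterator dead, the conservative policy keeps the monitor while the eager one releases it; with only the collection dead, both keep it, as argued above.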

Runtime verification systems like JavaMOP and Tracematches use off-the-shelf weak 
reference libraries to implement their garbage collectors. However, it turns out that these 
libraries, in order to be general and thus serve their purpose, perform many checks that 
are unnecessary in the context of monitoring. The second optimization of RV in addition 
to those of JavaMOP consists of a collection of data structures based on weak references,
which was carefully engineered to take full advantage of the particularities of monitoring 
parametric properties. These data structures allow for effective indexing and lazy collection 
of monitors, to minimize the number of expensive traversals of the entire pool of monitors. 
Moreover, RV caches monitor instances to save time when the same monitor instances are 
accessed frequently. For example, there is a high chance that the same iterator is accessed 
several times consecutively by a program, in which case saving and then retrieving the same 
corresponding monitor instances from the data structures at each iterator access can incur
considerable unnecessary overhead.
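
The combination of weak-reference indexing and caching can be sketched as follows. This is an illustration only, written with Python's standard `weakref` module rather than RV's specialized data structures; the classes `Monitor`, `Iter` and `MonitorIndex` and the counter `table_lookups` are hypothetical names introduced for the example.

```python
import weakref


class Monitor:
    """Stand-in for a monitor instance; it just records observed events."""
    def __init__(self):
        self.events = []


class Iter:
    """Stand-in for a monitored program object (for example an iterator)."""


class MonitorIndex:
    """Maps parameter objects to monitors without keeping the objects alive,
    and remembers the most recently used entry to skip repeated lookups."""

    def __init__(self):
        self._table = weakref.WeakKeyDictionary()
        self._last_key = None       # weak reference to the last key used
        self._last_monitor = None
        self.table_lookups = 0      # counts accesses that missed the cache

    def lookup(self, obj):
        # Fast path: the same object as in the previous access.
        if self._last_key is not None and self._last_key() is obj:
            return self._last_monitor
        # Slow path: go through the weak table, creating a monitor if needed.
        self.table_lookups += 1
        monitor = self._table.get(obj)
        if monitor is None:
            monitor = Monitor()
            self._table[obj] = monitor
        self._last_key = weakref.ref(obj)
        self._last_monitor = monitor
        return monitor


index = MonitorIndex()
it = Iter()
for _ in range(1000):
    index.lookup(it).events.append("next")
print(index.table_lookups, len(index.lookup(it).events))
```

The cache holds only a weak reference to the key, so it does not by itself prevent the monitored object from being reclaimed.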

9.2. Experiments and Evaluation. We next discuss our experience with using the two 
runtime verification systems that implement optimized variants of the parametric property 
monitoring techniques described in this paper. Also, we compare their performance with that
of Tracematches, which is, to our knowledge, the most efficient runtime verification system
besides JavaMOP and RV. Recall that Tracematches achieves virtually the same semantics
of parametric monitoring as ours, but using a considerably different approach.

9.2.1. Experimental Settings. We used a Pentium 4 2.66GHz / 2GB RAM / Ubuntu 9.10 
machine and version 9.12 of the DaCapo (DaCapo 9.12) benchmark suite [8], the most up- 
to-date version. We also present some of the results of our experiments using the previous 
version of DaCapo, 2006-10 MR2 (DaCapo 2006-10), namely those for the bloat and jython 
benchmarks. DaCapo 9.12 does not provide the bloat benchmark from the DaCapo 2006-10, 
which we favor because it generates large overheads when monitoring iterator-based prop- 
erties. The bloat benchmark with the UnsafeIter specification causes 19194% runtime 
overhead (i.e., 192 times slower) and uses 7.7MB of heap memory in Tracematches, and 
causes 569% runtime overhead and uses 147MB in JavaMOP, while the original program 
uses only 4.9MB. Also, although the DaCapo 9.12 provides jython, Tracematches cannot 
instrument jython due to an error that we were not able to understand or fix. Thus, we 
present the results for jython from DaCapo 2006-10. We used the default data input for DaCapo
and the -converge option to obtain the numbers after convergence within ±3%.
Instrumentation introduces a different garbage collection behavior in the monitored program,
sometimes causing the program to slightly outperform the original program; this accounts 
for the negative overheads seen in both runtime and memory. 
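
The relation between the reported percentages and slowdown factors can be spelled out with a short calculation. It assumes that overhead is measured relative to the run time of the unmonitored program, which is consistent with the negative overheads mentioned above; the function names are hypothetical.

```python
def slowdown_factor(overhead_percent):
    """Run time of the monitored program as a multiple of the original."""
    return 1.0 + overhead_percent / 100.0


def overhead_percent(original_seconds, monitored_seconds):
    """Percent overhead relative to the unmonitored run."""
    return 100.0 * (monitored_seconds - original_seconds) / original_seconds


# Overheads reported above for bloat with UnsafeIter.
tracematches = slowdown_factor(19194)   # about 192.94 times the original run time
javamop = slowdown_factor(569)          # about 6.69 times the original run time

# A monitored run that happens to finish faster yields a negative overhead
# (the timings here are made up for illustration).
faster = overhead_percent(10.0, 9.5)
print(tracematches, javamop, faster)
```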

We used the Sun JVM 1.6.0 for the entire evaluation. The AspectJ compiler (ajc)
version 1.6.4 is used for weaving the aspects generated by JavaMOP and RV into the target 
benchmarks. Another AspectJ compiler, abc [3] 1.3.0, is used for weaving Tracematches 
properties because Tracematches is part of abc and does not work with ajc. For JavaMOP, 
we used the most recent release version, 2.1.2. For Tracematches, we used the most recent 
release version, 1.3.0, from [36], which is included in the abc compiler as an extension. 
To figure out the reason that some examples do not terminate when using Tracematches, 
we also used the abc compiler for weaving aspects generated by JavaMOP and RV. Note 
that JavaMOP and RV are AspectJ compiler-independent. They show similar overheads 
and terminate on all examples when using the abc compiler for weaving as when ajc is 
used. Because the overheads are similar, we do not present the results of using abc to 
weave JavaMOP and RV generated aspects in this paper. However, using abc to weave
JavaMOP and RV properties confirms that the high overhead and non-termination come
from Tracematches itself, not from the abc compiler.

Figure 5: Comparison of Tracematches (TM), JavaMOP (MOP), and RV: (A) average
percent runtime overhead; (B) total peak memory usage in MB. (convergence within
3%, ∞: not terminated after 1 hour)

The following properties are used in our experiments. Some of them were already 
discussed in Sections 1.1 and 2; others are borrowed from [10, 11, 26, 13].

• HasNext: Do not use the next element in an iterator without checking for the existence 
of it; 

• UnsafeIter: Do not update a collection when using the iterator interface to iterate its 
elements; 

• UnsafeMapIter: Do not update a map when using the iterator interface to iterate its 
values or its keys; 

• UnsafeSyncColl: If a collection is synchronized, then its iterator also should be ac- 
cessed synchronously; 

• UnsafeSyncMap: If a collection is synchronized, then its iterators on values and keys 
also should be accessed synchronously.

All of them are tested on Tracematches, JavaMOP, and RV for comparison. We also 
monitored all five properties at the same time in RV, which was not possible in other 
monitoring systems for performance reasons or structural limitations. 
Figure 6: Monitoring statistics: number of events (E), number of created monitors (M),
number of flagged monitors (FM), number of collected monitors (CM). 



9.2.2. Results and Discussions. Figures 5 and 6 summarize the results of the evaluation.
Note that the structure of DaCapo 9.12 allows us to instrument all of the benchmarks
plus all supplementary libraries that the benchmarks use, which was not possible for
DaCapo 2006-10. Therefore, fop and pmd show higher overheads than the benchmarks using
DaCapo 2006-10 from [13]. While other benchmarks show overheads less than 80% in
JavaMOP, bloat, avrora, and pmd show prohibitive overhead in both runtime and memory
performance. This is because they generate many iterators and all properties in this
evaluation are intended to monitor iterators. For example, bloat creates 1,625,770 collections
and 941,466 iterators in total, while 19,605 iterators coexist at the same time at peak, in
an execution. avrora and pmd also create many collections and iterators. The three benchmarks call
hasNext() 78,451,585 times, 1,158,152 times and 4,670,555 times and next() 77,666,243 times,
352,697 times and 3,607,164 times, respectively. Therefore, we mainly discuss those three
examples in this section, although RV shows improvements for other examples as well. 

Figure 5(A) shows the percent runtime overhead of Tracematches, JavaMOP, and RV.
Overall, RV averages half the runtime overhead of JavaMOP and orders of
magnitude less runtime overhead than Tracematches (recall that these are the most optimized
runtime verification systems). With bloat, RV shows less than 260% runtime overhead for 
each property, while JavaMOP always shows over 440% runtime overhead and Tracematches 
always shows over 1350% for completed runs and crashed for UnsafeMapIter. With 
avrora, on average, RV shows 62% runtime overhead, while JavaMOP shows 139% runtime 
overhead and Tracematches shows 203% and hangs for UnsafeMapIter. With pmd, on 
average, RV shows 94% runtime overhead, while JavaMOP shows 231% runtime overhead 
and Tracematches shows 1139% and hangs for UnsafeMapIter and UnsafeSyncMap. 

Also, RV was tested with all five properties together and showed 982%, 275%, and
620% overhead for bloat, avrora, and pmd, respectively, which is still lower than or comparable
to the overhead of monitoring a single property in JavaMOP or Tracematches. The overhead for monitoring all the
properties simultaneously can be slightly larger than the sum of their individual overheads 
since the additional memory pressure makes the JVM's garbage collection behave differently. 

Figure 5(B) shows the peak memory usage of the three systems. RV has lower peak
memory usage than JavaMOP in most cases. The cases where RV does not show lower peak 
memory usage are within the limits of expected memory jitter. However, memory usage of 
RV is still higher than the memory usage of Tracematches in some cases. Tracematches has
several finite automata specific memory optimizations [4], which cannot be implemented
in formalism-independent systems like RV and JavaMOP. Although Tracematches is some- 
times more memory efficient, it shows prohibitive runtime overhead monitoring bloat and 
pmd. There is a trade-off between memory usage and runtime overhead. If RV more ac- 
tively removes terminated monitors, memory usage will be lower, at the cost of runtime 
performance. Overall, the monitor garbage collection optimization makes RV the most
efficient of the three parametric monitoring systems, with reasonable memory performance.

Figure 6 shows the number of triggered events, of created monitors, of monitors flagged
as unnecessary by RV's optimization, and of monitors collected by the JVM. Among the
DaCapo examples, bloat, avrora, h2, pmd and sunflow generated a very large number of
events (millions) in all properties, resulting in millions of monitors created in most cases.
h2 does not exhibit large overhead because monitor instances in h2 have shorter lifetimes;
therefore the created monitor instances are not used as heavily as in bloat. sunflow has
millions of events but does not create as many monitor instances as other benchmarks. When
monitoring the HasNext and UnsafeIter properties, RV's garbage collector effectively
flagged monitors as unnecessary and most were collected by the JVM.

The experimental evaluation in this section shows that the approach to parametric 
trace slicing and monitoring discussed in this paper is indeed feasible, provided that it is 
not implemented naively. Indeed, as seen in the tables in this section, implementation op- 
timizations make a huge difference in the runtime and memory overhead. This paper was 
not dedicated to optimizations and implementations; its objective was to only introduce the 
mathematical notions, notations, proofs and abstract algorithms underlying the semantical 
foundation of parametric properties and their monitoring. Current and future implementations
build, and will continue to build, on this foundation, applying specific optimizations and
heuristics to reduce the runtime or the memory overhead caused by monitoring.

10. Concluding Remarks, Future Work and Acknowledgments 

A semantic foundation for parametric traces, properties and monitoring was proposed. A 
parametric trace slicing technique, which was discussed and proved correct, allows the
extraction of all the non-parametric trace slices from a parametric trace by traversing the
original trace only once and dispatching each parametric event to its corresponding slices. 
It thus enables the leveraging of any non-parametric, i.e., conventional, trace analysis tech- 
niques to the parametric case. A parametric monitoring technique, also discussed and 
proved correct, makes use of it to monitor arbitrary parametric properties against para- 
metric execution traces using and indexing ordinary monitors for the base, non-parametric 
property. Optimized implementations of the discussed techniques in JavaMOP and RV 
reveal that their generality, compared to the existing similar but ad hoc and limited tech- 
niques in current use, does not come at a performance expense. Moreover, further static 
analysis optimizations like those in [25, 10, 12, 18] may significantly reduce the runtime
and memory overheads of monitoring parametric properties based on the techniques and 
algorithms discussed in this paper. 
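
As a reminder of what one-pass slicing involves, the following is a minimal sketch in the spirit of the slicing technique summarized above; it is not the paper's algorithm verbatim. It assumes that a parameter instance is represented as a frozenset of (parameter, value) pairs and an event as a (name, instance) pair; all function names are hypothetical. The table of known instances is kept closed under lubs, and the result is checked against the direct definition of a slice (the events whose instance is less informative than the given one), using the iterator trace from the introductory example.

```python
def inst(**bindings):
    """A parameter instance as a frozenset of (parameter, value) pairs."""
    return frozenset(bindings.items())


def compatible(a, b):
    """Instances are compatible if they agree on every shared parameter."""
    db = dict(b)
    return all(db.get(p, v) == v for p, v in a)


def slice_by_definition(trace, theta):
    """Events whose instance is less informative than (a subset of) theta."""
    return [name for name, event_inst in trace if event_inst <= theta]


def slice_online(trace):
    """Single pass over the trace; maps each known instance to its slice."""
    table = {frozenset(): []}   # starts with the empty instance only
    for name, theta in trace:
        known = list(table)
        # lubs of theta with every compatible known instance
        targets = {theta | k for k in known if compatible(theta, k)}
        updates = {}
        for target in targets:
            # most informative known instance below target (may be target)
            base = max((k for k in known if k <= target), key=len)
            updates[target] = table[base] + [name]
        table.update(updates)
    return table


trace = [
    ("createIter", inst(c="c1", i="i1")),
    ("next", inst(i="i1")),
    ("createIter", inst(c="c1", i="i2")),
    ("updateColl", inst(c="c1")),
    ("next", inst(i="i1")),
]
slices = slice_online(trace)
print(slices[inst(c="c1", i="i1")])
print(slices[inst(c="c1", i="i2")])
```

Any non-parametric analysis can then be run on each entry of `slices` independently, which is the sense in which slicing lifts conventional techniques to the parametric case.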

The parametric trace slicing technique in Section 6 enables the leveraging of any
non-parametric, i.e., conventional, trace analysis techniques to the parametric case. We have
only considered monitoring in this paper. Another interesting and potentially rewarding 
use of our technique could be in the context of property mining. For example, one could run 
the trace slicing algorithm on large benchmarks making intensive use of library classes, and
then, on the obtained trace slices corresponding to particular classes or groups of classes of 
interest, run property mining algorithms. The mined properties, or the lack thereof, may 
provide insightful formal documentation for libraries, or even detect errors. Preliminary 
steps in this direction are reported in [22].